AMENDMENTS

This listing of claims will replace all prior versions, and listings, of claims in the 
application. 

Listing of Claims:

Claims 1-48 (Canceled). 

49. (currently amended) A method performed in a computer of simulating a 
metabolic capability of an in silico strain of a microbe, comprising: 

obtaining a plurality of DNA sequences comprising most of metabolic genes in a genome 
of the microbe to produce an in silico representation of a microbe; 

determining open reading frames of genes of unknown function in the microbe in said 
plurality of DNA sequences; 

assigning a function to proteins encoded by said open reading frames by determining the 
homology of said open reading frames to gene sequences encoding proteins of known function in 
a different organism; 

determining which of said open reading frames correspond to metabolic genes by 
determining if the assigned function of said proteins is involved in cellular metabolism; 

determining substrates, products and stoichiometry of the reaction for each of the gene 
products of said metabolic genes; 

producing a genome specific stoichiometric matrix of said microbe produced by
incorporating said substrates, products and stoichiometry into a stoichiometric matrix;

determining a metabolic demand corresponding to a biomass composition of said 
microbe; 

calculating uptake rates of metabolites of said microbe; 

combining said metabolic demands and said uptake rates with said stoichiometric matrix 
to produce an in silico representation of said microbe; 

incorporating a general linear programming problem to produce an in silico strain of said 
microbe; 

performing a flux balance analysis on said in silico strain, and

providing a visual output to a user of said analysis that simulates a metabolic capability
of said strain predictive of said microbe's phenotype.

50. (previously presented) The method of claim 49, wherein said microbe is 
Escherichia coli. 

51. (previously presented) The method of claim 49, wherein said genes involved in 
cellular metabolism comprise genes involved in central metabolism, amino acid metabolism, 
nucleotide metabolism, fatty acid metabolism, lipid metabolism, vitamin and cofactor 
biosynthesis, energy and redox generation or carbohydrate assimilation. 

52. (previously presented) The method of claim 49, wherein assigning a function 
comprises performing a homology search using the Basic Local Alignment Search Tool 
(BLAST). 

Claims 53-55 (canceled). 

56. (previously presented) The method of claim 49, wherein said uptake rates are 
calculated by measuring the depletion of substrate from growth media of said microbe. 

57. (currently amended) A method performed in a computer for simulating a 
metabolic capability of an in silico strain of a microbe, comprising: 

a) providing a nucleotide sequence of a metabolic gene in the microbe; 

b) determining substrates, products and stoichiometry of the reaction for the gene 
product of said metabolic gene, wherein said gene product having an unknown function in the 
microbe is assigned a function by determining homology of said nucleotide sequence to gene 
sequences encoding gene products of known function in a different organism; 

c) repeating steps a) and b) for most metabolic genes of said microbe to produce an 
in silico representation; 

d) producing a genome specific stoichiometric matrix produced by incorporating
said substrates, products and stoichiometry of the metabolic gene products in said microbe
into a stoichiometric matrix;

e) determining a metabolic demand corresponding to a biomass composition of said
microbe;

f) calculating uptake rates of metabolites of said microbe;

g) combining said metabolic demands and said uptake rates with said stoichiometric 
matrix to produce an in silico representation of said microbe; 

h) incorporating a general linear programming problem to produce an in silico strain 
of said microbe; 

i) performing a flux balance analysis on said in silico strain; and 

j) providing a visual output to a user of said analysis that simulates a metabolic
capability of said strain predictive of said microbe's phenotype.

58. (previously presented) The method of claim 57, wherein the microbe is
Escherichia coli.

59. (previously presented) The method of claim 57, wherein said metabolic gene is 
selected from the group consisting of: genes involved in central metabolism, amino acid 
metabolism, nucleotide metabolism, fatty acid metabolism, lipid metabolism, vitamin and 
cofactor biosynthesis, energy and redox generation and carbohydrate assimilation. 

60. (previously presented) The method of claim 57, wherein assigning a function 
comprises performing a homology search using the Basic Local Alignment Search Tool 
(BLAST). 

Claims 61-63 (canceled).

64. (previously presented) The method of claim 57, wherein said uptake rates are 
calculated by measuring the depletion of substrate from growth media of said microbe. 

Claims 65-67 (canceled).

68. (previously presented) The method of claim 51, wherein said genes are involved in 
central metabolism. 

69. (previously presented) The method of claim 51, wherein said genes are involved in
amino acid metabolism.

70. (previously presented) The method of claim 51, wherein said genes are involved in
nucleotide metabolism.

71. (previously presented) The method of claim 51, wherein said genes are involved in 
fatty acid metabolism. 

72. (previously presented) The method of claim 51, wherein said genes are involved in 
lipid metabolism. 

73. (previously presented) The method of claim 51, wherein said genes are involved in 
vitamin and cofactor biosynthesis. 

74. (previously presented) The method of claim 51, wherein said genes are involved in 
energy and redox generation. 

75. (previously presented) The method of claim 51, wherein said genes are involved in 
carbohydrate assimilation. 

76. (previously presented) The method of claim 59, wherein said genes are involved in 
central metabolism. 

77. (previously presented) The method of claim 59, wherein said genes are involved in 
amino acid metabolism. 

78. (previously presented) The method of claim 59, wherein said genes are involved in 
nucleotide metabolism. 

79. (previously presented) The method of claim 59, wherein said genes are involved in 
fatty acid metabolism. 

80. (previously presented) The method of claim 59, wherein said genes are involved in 
lipid metabolism. 

81. (previously presented) The method of claim 59, wherein said genes are involved in
vitamin and cofactor biosynthesis.

82. (previously presented) The method of claim 59, wherein said genes are involved in
energy and redox generation.

83. (previously presented) The method of claim 59, wherein said genes are involved in
carbohydrate assimilation.

REMARKS

Claims 49-52, 56-60, 64 and 68-83 are pending. Claims 49 and 57 have been amended. 
Support for the amendment can be found throughout the application as filed. Support for the 
amendment directed to incorporating substrates, products and stoichiometry into a stoichiometric 
matrix, can be found, for example, in the claims as originally filed, at page 8, line 11 through
page 9, line 17, Example 1 and Figure 2. Support for the amendment directed to a metabolic 
capability of the in silico strain that is predictive of the microbe's phenotype, can be found at, for 
example, page 5, lines 9-11 and 15-17; page 13, lines 14-18; page 16, lines 4-6 and 8-9; page 17, 
lines 19-27 and Examples 3 and 4. Accordingly the amendments do not introduce new matter 
and entry thereof is respectfully requested. 

Interview Summary 

Applicant, representatives from Applicant's licensee Genomatica and counsel of record 
wish to thank Examiners Negin and Moran for the telephonic interview conducted on April 29, 
2009. Applicant, counsel and licensee's representatives discussed that one skilled in the art 
would not have combined the references cited under 35 U.S.C. § 103(a) to arrive at the claimed 
invention because the alleged motivation to do so is lacking. A proposed declaration directed to 
this point was discussed. 

Rejections Under 35 U.S.C. § 103 

Claims 49-52, 56-60 and 68-83 stand rejected under 35 U.S.C. § 103(a) as allegedly 
obvious over Pramanik et al., Biotech. and Bioengineering 56:398-421 (1997) in view of Blattner
et al., Science 277:1453-69 (1997) and Kunst et al., Rev. in Microbiol. 142:905-12 (1991). The 
Examiner alleges that Pramanik et al. investigate a stoichiometric model of E. coli metabolism. 
The Examiner concedes that Pramanik et al. fail to teach obtaining a plurality of DNA sequences 
and determining which open reading frames correspond to metabolic genes. Blattner et al. 
allegedly describes mapping the E. coli genome and assigning function to proteins by 
determining similarity to proteins of known function. The Examiner concedes that neither 
Pramanik et al. nor Blattner et al. teach assigning function to genes of unknown function based on
homology to proteins in a different organism. Kunst et al. is alleged to describe sequencing of the B.
subtilis genome and to express an interest in comparing the B. subtilis and E. coli genomes. The
Examiner concludes that it would have been obvious to one of ordinary skill in the art to modify 
the stoichiometric model of Pramanik et al. by using the complete genome sequence of Blattner 
et al. and the genome comparisons of Kunst et al. because (1) metabolism can be further 
analyzed; (2) the full sequence enables global approaches to understanding biological function 
and looking at the evolutionary history, and (3) homology comparisons allow for an analysis of 
evolutionary differences. 

The Supreme Court in KSR stated that "a patent composed of several elements is not 
proved obvious merely by demonstrating that each of its elements was, independently, known in 
the prior art." KSR Int'l. Co. v. Teleflex, Inc., et al., 127 S. Ct. 1727 (2007). The Supreme Court
noted that inventions in most, if not all, instances rely upon building blocks "long since 
uncovered." Thus, claimed discoveries will generally be combinations of what is already known. 
KSR requires that an Examiner provide "some articulated reasoning with some rational 
underpinning to support the legal conclusion of obviousness." Id. at 1741. An Examiner must 
"identify a reason that would have prompted a person of ordinary skill in the relevant field to 
combine the elements in the way the claimed new invention does," Id. Furthermore, the 
Examiner must include an explanation of "the effects of demands known to the design 
community or present in the marketplace" and "the background knowledge possessed by a 
person having ordinary skill in the art." Id. It is respectfully submitted that the current Office 
Action falls short of providing the analysis described by the Supreme Court in KSR. 

Applicant submits that the conclusory statements put forth in the Office Action that it 
would have been obvious to modify the stoichiometric model of E. coli as described by Pramanik 
et al. with the genome sequence of Blattner et al. and the homology comparisons of Kunst et al. 
allegedly because (1) metabolism can be further analyzed; (2) the full sequence enables global 
approaches to understanding biological function and looking at the evolutionary history, and (3) 
homology comparisons allow for an analysis of evolutionary differences fail to include the 
requisite explanations required by KSR. For example, the required explanations of the effects of 
demands known to the design community or present in the marketplace and the background 
knowledge possessed by a person having ordinary skill in the art are lacking. Accordingly, the 
Examiner has failed to articulate a prima facie case identifying a reason that would have
prompted a person of ordinary skill in the relevant field to combine the elements in the way the
claimed invention does.

In its discussion of United States v. Adams, 383 U.S. 39, 40 (1966), the Court in KSR 
further indicated that the presence of unexpected results supports conclusions that the invention 
is not obvious to those skilled in the art. KSR v. Teleflex, 127 S.Ct. at 1740. 

Applicant respectfully submits that the skilled person would not have modified the 
stoichiometric model of Pramanik et al. with the genome sequence of Blattner et al. using the 
comparisons of Kunst et al. to determine functions of unknown proteins based on the motivation 
stated in the Office Action, or any other motivation in the knowledge of one of ordinary skill in 
the art or from the nature of the problem to be solved. 

As set forth previously of record, Pramanik et al. describe construction of a metabolic 
model using only biochemical data. While Pramanik et al. list some genes, the list is incomplete 
compared to the model and consists only of known genes encoding proteins with a known 
biochemical activity. There is no teaching, suggestion or hint of an incentive to use information 
other than known biochemical data, including gene or genomic information, to modify or expand
the content of the model. Rather, Pramanik et al. state "[t]here was close agreement between the 
predicted and experimentally determined flux values" (page 410, col. 1, para. 3) and "[t]his 
metabolic model should be a useful tool for studying the effects of reengineering pathways" 
(page 410, col. 2, para. 2). Based on these statements, one skilled in the art would conclude that 
the Pramanik et al. model is successful and that there is no problem to be solved that would 
benefit by including metabolic reactions deduced from open reading frames of genes of unknown 
function. 

Blattner et al. report on the E. coli genome sequencing and similarly fail to provide any 
teaching, suggestion or motivation to combine genome sequence information with the 
biochemically-based model of Pramanik et al. Nor does the genome sequence described by 
Blattner et al. provide any general knowledge that would contain an incentive to motivate one 
skilled in the art to incorporate the described genomic information into a stoichiometric model 
because a function or homology match cannot be assigned to a sizeable fraction of the genes.
For example, Blattner et al. teach that 38% of the protein-coding genes have no attributable
function (abstract, lines 1-2; page 1458, col. 3, para. 1, lines 1-9, and Table 4) and that nearly
60% of E. coli proteins have no match in any other complete genome that was considered (page 
1459, col. 2, para. 2, lines 1-3). 

Kunst et al. also fails to cure the deficiencies or provide any incentive to utilize open 
reading frames of genes of unknown function in a stoichiometric model because Kunst et al. 
admits that the majority of genes will have an unknown function. 

At the first level of analysis, the DNA sequence will lead to a complete catalogue 
of putative protein sequences. These are likely to fall into one of 3 categories: (1) 
those whose functions are known, (2) those which show similarities with proteins 
identified in other organisms and which may have similar though not necessarily 
identical function in B. subtilis, and (3) those, probably the majority, whose 
function is unknown at present.

Id., page 905, para. bridging columns 1 and 2 (emphasis added).

A teaching that a majority of genes will have an unknown function fails to provide any 
incentive or motivation for one to combine Pramanik et al. and Blattner et al. with Kunst et al. 
because it informs the skilled person in the art that there is a high likelihood that incorrect 
information will be incorporated into the model. Incorrect assignment and incorporation into a 
stoichiometric model of putative metabolic genes or genes with unknown function as a metabolic 
gene will lead to inaccurate fluxes and diminution in the ability of the model to correctly 
simulate or predict a phenotype of the microbial organism. Accordingly, both Blattner et al. and 
Kunst et al., while they report on sequenced genomes and comparative analysis, teach that one 
skilled in the art, upon a careful reading of Pramanik et al., Blattner et al. and Kunst et al., would 
not be motivated to combine these references to arrive at the invention as claimed because 
incorporation of incorrect information leading to a less predictive model is a likely possibility. 

Applicants submit herewith three Declarations by Drs. Keasling, Palsson and Nielsen, 
attached as Exhibits A, B and C, respectively. Dr. Keasling is the senior author on the primary
reference cited above and a leader in metabolic modeling. Dr. Palsson is the inventor on the 
above-identified application and a pioneer in the field of stoichiometric models of metabolism. 
Dr. Nielsen is a prominent researcher in the field of metabolic models. Drs. Keasling and 
Palsson declare that one would not have expected the combination to result in a model that is
capable of accurately predicting a microbial phenotype. Dr. Palsson further declares that a
prominent scientist in the field wrote a letter to him stating his disbelief that a stoichiometric 
model constructed primarily from genomic information would be expected to work. Dr. Nielsen 
declares that the prominent scientist also voiced his disbelief in a public forum. 

Dr. Keasling declares that at the time when the genome sequence of Blattner et al. was
available, he did not consider utilizing sequence information to incorporate additional metabolic
reactions into his model because the resulting in silico model would not have been expected to be
predictive of an actual organism's metabolism, ¶ 7. Dr. Keasling further declares that his model,
as described in Pramanik et al., would have been expected to produce a large number of
inaccurate fluxes and lead to a model that is much less predictable if putative reactions were
included based on sequence homology comparisons, ¶ 7. Dr. Keasling also declares that
identification and assignment of some open reading frames as putative metabolic enzymes was
speculative and likely to result in incorporation of inaccurate information and loss of the
resultant model's ability to predict a phenotype, ¶ 8. The ability of a model constructed from
genomic sequence data to predict actual cellular metabolism was unexpected (¶¶ 7 and 8).

Dr. Palsson declares that those skilled in the field of metabolic modeling and engineering
would not have been motivated to incorporate the genomic data of Blattner et al. or sequence
comparisons of Kunst et al. because the predictability of the resultant model would have been
expected to be reduced due to incorporation of incorrect information, ¶¶ 8-9. Dr. Palsson further
declares that Dr. Bailey, a respected and prominent scientist in the art, did not believe that the
construction of a metabolic model from genomic sequence information worked as claimed.
However, if such a model did work as claimed and provided predictable fluxes, the approach
represents a very major advance, ¶ 11. This disbelief or alternative characterization as a
breakthrough discovery is supported in a letter sent from Dr. Bailey to Dr. Palsson where Dr.
Bailey expressly states that the model as claimed is difficult to believe, or alternatively, a
breakthrough in the field, ¶¶ 13-16.

Dr. Nielsen declares from personal knowledge that Dr. Bailey publicly opposed Dr.
Palsson by voicing his belief that it was not possible to predict metabolic functions and cellular
physiology using stoichiometric models reconstructed from genomic information because the
large degrees of freedom are likely to yield false phenotypes, ¶ 8.

Based on the attached Declarations and the accompanying remarks herein, Applicant 
respectfully submits that the cited references do not provide any suggestion or motivation to 
combine their teachings to arrive at the claimed invention. Nor do the cited references, 
determined from the vantage point of one skilled in the art, including their general knowledge 
and knowledge of the problem to be solved, provide any incentive that would have motivated 
one skilled in the art to modify the references or combine them to arrive at the invention as 
claimed. 

Claim 64 stands rejected under 35 U.S.C. § 103(a) as allegedly obvious over Pramanik et 
al. in view of Blattner et al., in view of Kunst et al. and further in view of Xie et al., TIBTECH
15:109-113 (1997). Pramanik et al., Blattner et al. and Kunst et al. are applied as described 
above. Xie et al. allegedly describes integrated approaches to the design of media and that the 
composition of growth medium and its depletion over time affects growth of cells. The 
Examiner alleges that it would have been obvious to one of ordinary skill in the art to modify the 
studies of E. coli of Pramanik et al., Blattner et al. and Kunst et al. by the nutrient depletion 
studies of Xie et al. because stronger media can be designed to enable better growth of cells. 

The rejection of claim 64 relies on the primary reference by Pramanik et al. in 
combination with Blattner et al. and Kunst et al. The deficiencies of this combination are 
detailed above and in the attached Declarations. The tertiary reference to Xie et al. does not 
address, much less cure these deficiencies, which are fatal to the instant obviousness rejection. 
Accordingly, Applicants respectfully request withdrawal of the rejection of claim 64 under 35 
U.S.C. § 103 as obvious over Pramanik et al. in view of Blattner et al. and Kunst et al. as applied 
to claims 49-52, 56-60 and 68-83, and further in view of Xie et al. 

Double Patenting 

Claims 49-52, 56-60 and 64 stand provisionally rejected under the judicially created 
doctrine of obviousness-type double patenting as allegedly obvious over claims 26-28, 30, 32,
35, 36, and 39-41 of copending application serial No. 11/980,199. The Examiner points out that
Applicant's previous arguments are moot in view of the new ground of rejection. 

Applicant respectfully requests deferral of this provisional ground of rejection until such 
time that there is an indication of allowable subject matter. Applicant respectfully points out that 
application serial No. 11/980,199 has yet to receive any substantive prosecution. Per Applicant's
previous response, should the subject application be deemed in condition for allowance prior to
application serial No. 11/980,199, Applicant respectfully requests that this provisional rejection 
be withdrawn in this earlier filed application and permit it to proceed to issuance without need of 
a terminal disclaimer. MPEP § 804(I)(B). 

CONCLUSION 

To the extent necessary, a petition for an extension of time under 37 C.F.R. 1.136 is 
hereby made. Please charge any shortage in fees due in connection with the filing of this paper, 
including extension of time fees, to Deposit Account 502624 and please credit any excess fees to 
such deposit account. 

Respectfully submitted, 

McDERMOTT WILL & EMERY LLP 

/David A. Gay/ 

David A. Gay 
Registration No. 39,200 

Please recognize our Customer No. 41552 
as our correspondence address. 



11682 El Camino Real, Suite 400
San Diego, CA 92130 
Phone: 858.720.3300 DAG:cjh 
Facsimile: 858.720.7800 
Date: June 18, 2009

I, Jay D. Keasling, declare as follows:

1. I am a Professor in the Department of Chemical Engineering and Bioengineering
at the University of California, Berkeley (UC Berkeley). I also hold the Hubbard Howe Jr.
Distinguished Professorship of Biochemical Engineering. I am the Acting Deputy Director of the
Lawrence Berkeley National Laboratory and Synthetic Biology Engineering Research Center
and am CEO of the Joint BioEnergy Institute. I joined the faculty of UC Berkeley in 1992 as an
Assistant Professor. I became an Associate Professor in 1998 and was elevated to full professor
in 2001. I served as Vice Chair of the Department of Chemical Engineering from 1999-2000 and
have served as the Director and an Executive Committee Member of the UC BioSTAR Program
since 2000.

2. Prior to joining the UC Berkeley faculty I obtained a Bachelors of Science 
majoring in chemistry and biology in 1986 from the University of Nebraska-Lincoln. I earned 
my Masters degree in 1988 and my Ph.D. in 1991, both in chemical engineering from the 
University of Michigan. From 1991-1992 I did a postdoctoral fellowship at Stanford University
in Biochemistry. A copy of my curriculum vitae and a list of publications is attached as Exhibit I.

3. I am an inventor or co-inventor on at least four U.S. patents and 16 U.S. 
applications. I am a founder of Amyris Biotechnologies and serve as the Chair of its Scientific 
Advisory Board. Amyris focuses on the microbial production of renewable fuels. I also am a 
founder of LS9 and Codon Devices. I have been a member of Genomatica's Scientific Advisory 
Board for the past year. My accomplishments in the fields of chemical engineering and synthetic 
biology have been reported in Time and Newsweek, and Discover magazine named me as the 
Scientist of the Year in 2006 for my work in synthetic biology, including treatments for malaria, 
AIDS, and cancer as well as discoveries of new fuel resources. 

4. I am very familiar with stoichiometric models of metabolism and have read U.S. 
application serial no. 09/923,870, by Palsson. I also am very familiar with Dr. Palsson's work, 
including the publication that is the basis of this application (Edwards and Palsson, Proc. Natl. 
Acad. Sci. U.S.A., 97:5528-33 (2000)). I understand that the invention described in this 
application is directed to constructing genome specific stoichiometric matrices that can be 
utilized with flux balance analysis for modeling metabolism. The application claims, in part, a 
method of simulating a metabolic capability by incorporating metabolic reactions through the use 
of genome information to assign function to metabolic proteins of unknown function. 

5. I have read the Office Action mailed December 18, 2008. I understand that the 
claimed invention stands rejected for obviousness over the combination of references to 
Pramanik and Keasling, Biotech. and Bioengineering 56:398-421 (1997) in view of Blattner et
al., Science 277:1453-69 (1997) and in view of Kunst et al., Rev. in Microbiol. 142:905-12 
(1991). The Examiner appears to rely on Pramanik and Keasling for describing a stoichiometric 
model of E. coli metabolism and then combines it with Blattner et al. and Kunst et al., reporting 
the sequencing of E. coli and B. subtilis genomes, respectively, to conclude obviousness. The 
sequencing papers are used to support the Examiner's argument that one would have expected to 
be able to determine the function of genes encoding proteins of unknown function based on 
sequence comparisons with a different organism.

6. Pramanik and Keasling, the primary reference cited in the above rejection, is a
publication from my laboratory and I am very familiar with this work. At the time of Dr. 
Palsson's invention, the E. coli genomic sequence had become available and my laboratory was 
actively working with stoichiometric models, including the model described in Pramanik and 
Keasling. We did not consider incorporating additional reaction information into the model 
based on the genomic sequence results for at least two reasons. 

7. First, in silico models of metabolism such as that described in Pramanik and 
Keasling are complex computational models that are only as accurate as the information one 
includes in the model. There are a large number of metabolic enzymes encoded in the genome 
that are not used in metabolism. Incorporation of reactions based on genomic information would 
have included such unused enzymes and reactions in the model. One would have expected a 
large number of inaccurate fluxes to occur that would, in effect, travel everywhere throughout 
the network (i.e., wild or uncontrolled fluxes). As a result, the model would not have been 
predictive of an organism's metabolism and would have been expected to be much less accurate 
than the model described in Pramanik and Keasling. The fact that Dr. Palsson was able to 
construct a model incorporating reactions based primarily on genomic sequence information was 
surprising and unexpected because it did not result in wild fluxes nor decrease in accuracy 
compared to the Pramanik and Keasling model. Rather, the model yielded results that reflected 
actual cellular metabolism and was predictive despite the inclusion of more enzymes than what 
the network uses. 

8. Second, incorporating reactions based on homology comparisons of unknown 
genes with metabolic genes in other organisms also was expected to yield inaccurate results. 
Although sequence identity comparisons can be predictive there are examples where 
identifications have been incorrect. Therefore, identification and assignment of some open 
reading frames as a putative metabolic enzyme was speculative and likely resulted in some 
incorrect assignments. These incorrect assignments can result in the inclusion of multiple
reactions carrying out the same reaction, inclusion of unused reactions and the inclusion of
non-metabolic enzymes into the metabolic network. For the reasons described above, incorporation
of such inaccurate information was expected to generate wild fluxes if incorporated into a model
such as that described in Pramanik and Keasling. It was surprising that one could, in fact,
incorporate putative metabolic enzymes and produce results that are predictive of cellular
metabolism. Hence, the actual result of the claimed method is unexpected because this method 
is able to accurately predict metabolism even though reactions are incorporated based on 
deductions from sequence comparisons. 

I hereby declare that all statements made herein of my own knowledge are true and that 
all statements made on information and belief are believed to be true; and further that these 
statements are made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code and that any such willful false statement may jeopardize the validity of the application or 
any patent issued thereon. 




Jay D. Keasling

Date: 12 June 2009

124. Seminar, Georgia Tech University, Center for the Study of Systems Biology, Atlanta,
GA, May 2, 2007.

125. Seminar, Georgia Tech University, Department of Chemical Engineering, Atlanta, GA, 
May 3, 2007. 

126. Seminar, Northern California AIChE, Berkeley, CA, May 15, 2007.

127. Seminar, University of British Columbia, Michael Smith Laboratories, Vancouver,
British Columbia, Canada, May 17, 2007.

128. Seminar, Congressional Biomedical Research Caucus, Washington, D.C., May 23, 2007. 

129. Seminar, PARC Forum, Palo Alto Research Center, Palo Alto, CA, May 24, 2007. 

130. Seminar, Harvard University, Department of Chemistry, Cambridge, MA, May 31, 2007. 

131. Seminar, Kavli Futures Symposium, Ilulissat, Greenland, June 13, 2007. 

132. Seminar, University of Manchester, Manchester Institute of Biotechnology, Manchester, 
UK, July 12, 2007. 

133. Presentation, Biochemical Engineering XV, Quebec City, Canada, July 12, 2007. 

134. Presentation, Natural Products Gordon Research Conference, Tilton, NH, July 25, 2007. 

135. Presentation, Society for Industrial Microbiology Meeting, Denver, CO, July 29, 2007. 

136. Presentation, Energy Modeling Forum, Workshop on Climate Impacts and Integrated 
Assessment, Snowmass, CO, August 1, 2007. 

137. Keynote Address, 10th Functional Genomics Meeting on Synthetic Biology, Goteborg, 
Sweden, August 28, 2007. 

138. Presentation, KI International Symposium Future Design, Korean Advanced Institute for 
Science and Technology, Daejeon, Korea, September 6, 2007. 

139. Keynote Address, Enzyme Engineering XIX, Harrison Hot Springs, British Columbia, 
Canada, September 23, 2007. 

140. Presentation, Metabolic Engineering Meeting, Mathematical Biosciences Institute, Ohio 
State University, Columbus, OH, September 24, 2007. 

141. Keynote Address, Frontiers in Transgenesis, Danforth Center, St. Louis, MO, September 
28, 2007. 

142. Seminar, Rice University, Department of Bioengineering, Houston, TX, October 10, 
2007. 

143. Presentation, Malaria Forum, Bill & Melinda Gates Foundation, Seattle, WA, October 
17, 2007. 

144. Presentation, PopTech, Camden, ME, October 20, 2007. 

145. Presentation, Energy Roundtable, Stanford University, Hoover Institute, Stanford, CA, 
November 20, 2007. 

146. Presentation, Biological and Environmental Research Advisory Committee (BERAC), 
Washington, DC, November 29, 2007. 

147. Harry S. Truman Award Lecture, Sandia National Laboratories, Albuquerque, NM, 
December 5, 2007. 

148. Presentation, International Conference on Cellular & Molecular Bioengineering, Nanyang 
Technological University, Singapore, December 10, 2007. 

149. Presentation, Symposium on Future Directions in Research at the Intersection of the 
Physical and Life Sciences (RIPLS), National Academy of Science, Washington, D.C., 
December 19, 2007. 

150. Keynote Address, Technology Innovation Conference, Novozymes, Copenhagen, 
Denmark, January 13, 2008. 

151. Presentation, US-EC Energy Symposium Exact Name, San Francisco, CA, February 22, 
2008. 

152. Keynote Address, 6th TLL Life Sciences Symposium, Temasek Life Sciences 
Laboratories, Singapore National University, Singapore, January 25, 2007. 

153. Presentation, Orinda Intermediate School, Orinda, CA, January 30, 2007. 

154. Keynote Address, 12th Netherlands Biotechnology Conference, Ede, The Netherlands, 
March 14, 2007. 

155. Presentation, Symposium on Synthetic Biology, University of Arizona, Tucson, AZ, 
March 19, 2008. 

156. Seminar, Duke University, Department of Biochemistry, Durham, NC, March 21, 2008. 

157. Seminar, Reliance Life Sciences, Mumbai, India, March 28, 2008. 

158. Seminar, Council of Scientific and Industrial Research, New Delhi, India, March 30, 
2008. 

159. Seminar, University of Nevada, Reno, Department of Chemical Engineering, Reno, NV, 
April 7, 2008. 

160. Seminar, University of California, Berkeley, Department of Mechanical Engineering, 
Berkeley, CA, March 10, 2008. 

161. Presentation, Targeting and Tinkering with Interaction Networks, Barcelona, Spain, April 
15, 2008. 

162. Presentation, Institute for Systems Biology, Seattle, WA, April 21, 2008. 

163. Seminar, University of Washington, Department of Bioengineering, Seattle, WA, April 
22, 2008. 

164. Seminar, Sangamo Biosciences, Richmond, CA, April 25, 2008. 

165. Presentation, Fifth Annual World Congress on Industrial Biotechnology & 
Bioprocessing, Chicago, IL, April 28, 2008. 

166. Seminar, California Institute of Technology, Department of Bioengineering, Pasadena, 
CA, May 5, 2008. 

167. Presentation, Khosla Ventures CEO Summit, location, May 7, 2008. 

168. Seminar, Scripps Research Institute, Department of Chemistry, La Jolla, CA, May 8, 
2008. 

169. Seminar, Novozymes, Davis, CA, May 12, 2008. 

170. Seminar, Harvard University Medical School, Department of Microbiology, May 27, 
2008. 

171. Presentation, Royal Society discussion on Synthetic Biology, London, UK, June 2, 2008. 

172. Presentation, Burrill & Company, San Francisco, CA, June 10, 2008. 

173. Presentation, CITRIS-Copenhagen Research Conference on Climate and Energy, 
Copenhagen, Denmark, June 18, 2008. 

174. Presentation, 4th European Plant Science Organization Conference, Cote d'Azur, France, 
June 26, 2008. 

175. Presentation, Gordon Research Conference on Enzymes, Coenzymes, and Metabolic 
Pathways, location, July 12, 2008. 

176. Presentation, 13th Annual Human Genome Meeting: Genomics and the Future of 
Medicine, Hyderabad, India, September 28-30, 2008. 

177. 

178. Keynote Address: "Synthetic biology in pursuit of inexpensive, effective, anti-malarial 
drugs," EPSRC Centre for Synthetic Biology and Innovation, Imperial College, London, 
UK, May 12, 2009. 

Workshops, Panels, and Short Courses 

1. Massachusetts Institute of Technology, Department of Chemical Engineering. August 10-14, 
1998. "Metabolic Engineering Short Course." 

2. AIChE workshop on Bioinformatics. Houston, TX. March 13-14, 1999. 

3. Massachusetts Institute of Technology, Department of Chemical Engineering. August 10-14, 
1999. "Metabolic Engineering Short Course." 

4. DARPA workshop on Metabolic Engineering. Washington, D.C. March 24 - 26, 2000. 

5. Lawrence Berkeley National Laboratory Workshop "Solar to Fuel - Future Challenges and 
Solutions", Berkeley, CA. March 28-29, 2005. 

6. 2005 Genomes to Life Program Workshop, Washington, DC. February 6-14, 2005. 

7. Intercollegiate Genetically Engineered Machine Competition (iGEM) 2005 Teacher's 
Workshop, Boston, MA. May 14-15, 2005. 

8. European Science Foundation Exploration Workshop, "Synthetic Biology: Constructing and 
Deconstructing Life" Arila, Spain. Oct. 13-16, 2005. 

Presentations at National or International Meetings 

1. J. D. Keasling, A. Joshi, and B. O. Palsson. 1987. "Towards rational design and 
exploitation of recombinant prokaryotic cells." 194th ACS National Meeting, New 
Orleans, LA. 

2. J. D. Keasling and B. O. Palsson. 1988. "Dynamics and control of vector replication." 
196th ACS National Meeting, Los Angeles, CA. 

3. J. D. Keasling and B. O. Palsson. 1989. "Design in bacterial plasmids." National AIChE 
Meeting, San Francisco, CA. 



4. J. D. Keasling, B. O. Palsson, and S. Cooper. 1990. "Cell-cycle-specific F'lac plasmid 
replication: regulation by cell size control of initiation." European Molecular Biology 
Organization Meeting on the Bacterial Cell Cycle, Collonges-La Rouge, France. 

5. J. D. Keasling, S. Cooper, and B. O. Palsson. 1990. "Dynamics and control of plasmid 
replication." AIChE National Meeting, Chicago, IL. 

6. S. Cooper and J. D. Keasling. 1991. "F plasmid replication: cell-cycle specificity, 
regulation by cell size control of initiation, and the relationship of different origins of 
replication to plasmid synthesis." Human Frontier Science Program Workshop on 
Regulatory Mechanisms of DNA Replication, Les Arcs, France. 

7. J. D. Keasling and S. Cooper. 1991. "Cell-cycle-specificity, regulation by cell-size 
control of initiation, and the relationship of different origins of replication to plasmid 
synthesis." American Society for Microbiology, Dallas, TX. 

8. S. Cooper and J. D. Keasling. 1991. "Synthesis and regulation of cytoplasm, DNA, cell 
surface, and plasmid during the bacterial division cycle." Cold Spring Harbor Symposium 
on Quantitative Biology, Cold Spring Harbor, NY. 

9. S. Cooper and J. D. Keasling. 1991. "Cell-cycle-specific F plasmid replication during 
the Escherichia coli division cycle: regulation of replication by cell size control of 
initiation." Gordon Conference on Extrachromosomal Elements. 

10. J. D. Keasling, S. Cooper, and B. O. Palsson. 1991. "Dynamics and Control of Bacterial 
Plasmid Replication." AIChE National Meeting, Los Angeles, CA. 

11. J. D. Keasling, B. O. Palsson, and S. Cooper. 1992. "Plasmid Replication during the 
Cell Cycle." Keystone Symposium on Molecular Mechanisms in DNA Replication and 
Recombination, Taos, NM. 

12. J. D. Keasling, L. Bertsch, A. Kornberg. 1993. "Guanosine pentaphosphate 
phosphohydrolase of Escherichia coli is a long-chain polyphosphatase." 205th ACS 
National Meeting, Denver, CO. 

13. J. D. Keasling, S. T. Sharfstein, B. Deaton, G. Hupf. 1993. "Engineering and phosphate 
and energy metabolism in micro-organisms." Biochemical Engineering VIII, Princeton, 
NJ. 

14. D. G. Bolesch and J. D. Keasling. 1993. "Anaerobic bioremediation of TCE 
contamination in groundwater." Zeneca Process Technology Conference, Leeds, UK. 

15. S. T. Sharfstein, B. Deaton, J. D. Keasling. 1993. "Engineering and phosphate and 
energy metabolism in micro-organisms." 207th American Chemical Society National 
Meeting, San Diego, CA 

16. J. D. Keasling, H. Kuo, and G. Vahanian. 1994. "A probabilistic representation of the 
Escherichia coli cell cycle." AIChE National Meeting, San Francisco, CA. 

17. S. T. Sharfstein, S. J. Van Dien and J. D. Keasling. 1994. "Engineering and phosphate 
and energy metabolism in micro-organisms." AIChE National Meeting, San Francisco, 
CA. 

18. G. A. Hupf, N. Shapiro and J. D. Keasling. 1994. "Manipulation of phosphate and 
energy metabolism to improve heavy metal resistance and uptake." AIChE National 
Meeting, San Francisco, CA. 

19. J. Pramanik and J. D. Keasling. 1994. "Mathematical analysis of fluxes through the 
metabolic pathways of Escherichia coli." AIChE National Meeting, San Francisco, CA. 

20. R. Pape, P. Jorjani, and J. D. Keasling. 1994. "Design and construction of low-copy 
plasmids for metabolic engineering of Escherichia coli." AIChE National Meeting, San 
Francisco, CA. 

21. D. Bolesch and J. D. Keasling. 1994. "Anaerobic bioremediation of chlorinated alkanes." 
AIChE National Meeting, San Francisco, CA. 

22. D. Bolesch and J. D. Keasling. 1995. "Anaerobic bioremediation of chlorinated 
hydrocarbons." In Situ and On-Site Bioreclamation, San Diego, CA. 

23. G. Hupf and J. D. Keasling. 1995. "Manipulation of phosphate and energy metabolism 
to improve heavy metal resistance and uptake." In Situ and On-Site Bioreclamation, San 
Diego, CA. 

24. J. D. Keasling, S. Van Dien, S. Keyhani, S. Sharfstein. 1995. "Engineering 
polyphosphate metabolism in bacteria." Biochemical Engineering VIII, Davos, 
Switzerland. 

25. P. C. Michels, J. A. Baross, J. D. Keasling, and D. S. Clark. 1995. "Bioremediation 
potential of newly isolated, metal-tolerant archaea." Biochemical Engineering VIII, 
Davos, Switzerland. 

26. J. D. Keasling, S. Van Dien, S. Keyhani, D. Bolesch, and S. Sharfstein. 1995. 
"Redirection of phosphate and energy metabolism through polyphosphate pathways." 
AIChE National Meeting, Miami Beach, FL. 

27. J. D. Keasling, D. Szykowny, and J. Elmen. 1995. "Degradation of aromatic 
hydrocarbons under denitrifying conditions." AIChE National Meeting, Miami Beach, 
FL. 

28. R. Brent Nielsen and J. D. Keasling. 1996. "Anaerobic bioremediation of chlorinated 
hydrocarbons." Engineering Foundation meeting Bioremediation of Surface and 
Subsurface Contamination in Palm Coast, FL. 

29. Joacim Elmen, Dave Szykowny, and J. D. Keasling. 1996. "Degradation of aromatic 
hydrocarbons under denitrifying conditions." Engineering Foundation meeting 
Bioremediation of Surface and Subsurface Contamination in Palm Coast, FL. 

30. J. D. Keasling. 1996. "Metabolic engineering of polyphosphate metabolism in bacteria 
for phosphate and heavy metal bioremediation." Engineering Foundation meeting 
Bioremediation of Surface and Subsurface Contamination in Palm Coast, FL. 

31. Jaya Pramanik and J. D. Keasling. 1996. "A flux-based model of metabolism: effect of 
biomass requirements and redirected pathways on central metabolism." 211th American 
Chemical Society National Meeting in New Orleans, LA. 

32. J. D. Keasling. 1996. "Metabolic engineering for bioremediation of inorganic 
pollutants" Metabolic Engineering, Danvers, MA. 

33. R. B. Nielsen and J. D. Keasling. 1996. "Kinetic parameter evaluation and modeling of 
the anaerobic conversion of trichloroethene to ethene." AIChE National Meeting, 
Chicago, IL. 

34. N. Eliashberg and J. D. Keasling. 1996. "Simulation of bacterial growth and substrate 
utilization in a polluted groundwater environment." AIChE National Meeting, Chicago, 
IL. 

35. J. Pramanik and J. D. Keasling. 1996. "A flux-based metabolic model for bacteria: study 
of metabolic regulation and its sensitivity to biomass composition." AIChE National 
Meeting, Chicago, IL. 

36. S. J. Van Dien and J. D. Keasling. 1996. "Engineering the polyphosphate levels in 
Escherichia coli and the effects on the phosphate-starvation response." AIChE National 
Meeting, Chicago, IL. 

37. J. Pramanik, P. L. Trelstad, and J. D. Keasling. 1996. "Analysis of bioremediation 
processes using a flux-based metabolic model." AIChE National Meeting, Chicago, IL. 

38. S. J. Van Dien and J. D. Keasling. 1997. "Engineering the polyphosphate levels in 
Escherichia coli: Effects of energy and phosphate starvation." ACS National Meeting, 
San Francisco, CA. 

39. R. B. Nielsen and J. D. Keasling. 1996. "Anaerobic biodegradation of chlorinated 
hydrocarbons by groundwater microorganisms." ACS National Meeting, San Francisco, 
CA. 

40. J. Pramanik, P. L. Trelstad, and J. D. Keasling. 1996. "Analysis of the metabolism of 
enhanced biological phosphorus removal using a fluxed-based metabolic model." ACS 
National Meeting, San Francisco, CA. 

41. J. D. Keasling. 1997. "In situ bioremediation of chlorinated and aromatic hydrocarbons 
in groundwater: application of modern molecular and mathematical tools." Biochemical 
Engineering X, Kananaskis, Canada. 

42. J. D. Keasling. 1997. "Development of tools for the metabolic engineering of bacteria." 
Biochemical Engineering X, Kananaskis, Canada. 

43. J. D. Keasling, J. Pramanik, J. Benemann. 1997. "Metabolic engineering for hydrogen 
fermentations." Biohydrogen '97, Kona, Hawaii. 

44. N. Eliashberg and J. D. Keasling. 1997. "Simulation of spacial heterogeneity 
development in a mutualistic mixed species biofilm." AIChE National Meeting, Los 
Angeles, CA. 

45. R. B. Nielsen and J. D. Keasling. 1997. "Kinetics of anaerobic biodegradation of 
chlorinated ethenes." AIChE National Meeting, Los Angeles, CA. 

46. T. A. Carrier and J. D. Keasling. 1997. "Mechanistic modelling of prokaryotic mRNA 
decay." AIChE National Meeting, Los Angeles, CA. 

47. S. J. Van Dien and J. D. Keasling. 1997. "Engineering polyphosphate metabolism in 
Escherichia coli." AIChE National Meeting, Los Angeles, CA. 

48. K. L. Jones and J. D. Keasling. 1997. "Construction, stability, and expression of low- 
copy vectors derived from the E. coli F plasmid." AIChE National Meeting, Los 
Angeles, CA. 

49. T. A. Carrier, K. L. Jones, and J. D. Keasling. 1997. "mRNA stability and plasmid copy 
number effects on gene expression from an inducible promoter system." AIChE National 
Meeting, Los Angeles, CA. 

50. R. B. Nielsen and J. D. Keasling. 1998. "Anaerobic degradation of PCE and TCE 
DNAPLs by groundwater microorganisms." Remediation of Chlorinated and 
Recalcitrant Compounds, Monterey, CA. 

51. E. Gilbert, A. Khlebnikov, W. Meyer-Ilse and J. D. Keasling. 1998. "Use of soft X-ray 
microscopy for analysis of early stage biofilm formation." Microbial Ecology of 
Biofilms: Concepts, Tools and Applications. International Association on Water Quality 
(IAWQ), Lake Bluff, IL. 

52. K. L. Jones, T. A. Carrier, and J. D. Keasling. 1998. "Plasmid vehicles for long-term, 
variable gene expression in Escherichia coli." AIChE National Meeting, Miami Beach, 
FL. 

53. P. L. Trelstad and J. D. Keasling. 1998. "Polyphosphate Metabolism in Acinetobacter 
calcoaceticus: Implications for Enhanced Biological Phosphorus Removal." AIChE 
National Meeting, Miami Beach, FL. 

54. R. Brent Nielsen and J. D. Keasling. 1998. "Anaerobic Dechlorination of PCE and TCE 
DNAPLs by Groundwater Microorganisms." AIChE National Meeting, Miami Beach, 
FL. 

55. C. Wang, A. M. Lum, S. C. Ozuna, D. S. Clark, and J. D. Keasling. 1999. "Cadmium 
precipitation by Escherichia coli producing cysteine desulfhydrase." ACS National 
Meeting, Anaheim, CA. 

56. R. Brent Nielsen and J. D. Keasling, 1999. "Identification of organisms present in a TCE- 
degrading consortium." ACS National Meeting, Anaheim, CA. 

57. A. Khlebnikov, O. Risa, and J. D. Keasling. 1999. "Gene expression in a decoupled 
autocatalytic system under control of inducible promoters." American Society for 
Microbiology General Meeting, Chicago, IL. 

58. E. Gilbert, A. Khlebnikov, and J. D. Keasling. 1999. "Dual-GFP labeling of cells in 
biofilms." American Society for Microbiology General Meeting, Chicago, IL. 

59. S-W. Bang, D. S. Clark, and J. D. Keasling. 1999. "Precipitation of heavy metals by 
expression of thiosulfate reductase." American Society for Microbiology General 
Meeting, Chicago, IL. 

60. C. Wang, S. C. Ozuna, D. S. Clark, and J. D. Keasling. 1999. "Metabolic engineering of 
microorganisms to precipitate cadmium wastes." AIChE National Meeting, Dallas, TX. 

61. A. W. Walker and J. D. Keasling. 1999. "Metabolic engineering of bacteria for the 
environment: the controlled degradation of parathion." AIChE National Meeting, Dallas, 
TX. 

62. P. L. Trelstad, D. Hong, and J. D. Keasling. 1999. "Understanding of the metabolism of 
enhanced biological phosphorus removal." AIChE National Meeting, Dallas, TX. 

63. C. D. Smolke, T. A. Carrier, and J. D. Keasling. 1999. "Engineering single and multiple 
gene expression through mRNA stability control." AIChE National Meeting, Dallas, TX. 

64. S. Reichmuth, J. D. Keasling, and H. W. Blanch. 1999. "Biodesulfurization of 
dibenzothiophene in Escherichia coli is enhanced by expression of a Vibrio harveyi 
oxidoreductase gene." AIChE National Meeting, Dallas, TX. 

65. S.W. Kim, K.L. Jones, and J. D. Keasling. 2000. "Expression of 1-deoxy-D-xylulose-5-
phosphate synthase in Escherichia coli Enhances Lycopene Production". American 
Society for Microbiology General Meeting, Los Angeles, CA. 

66. S. E Cowan, E. S. Gilbert, A. Khlebnikov and J. D. Keasling. 1999. "Dual labeling with 
green fluorescent proteins for confocal microscopy." IAWQ/IWA Conference on Biofilm 
Systems, International Association on Water Quality, New York, NY. 

67. K. D. McMahon, M. A. Dojka, N. R. Pace, J. D. Keasling, and D. Jenkins. 1999. 
"Microbial Community Structure of Laboratory Activated Sludge Performing Enhanced 
Biological Phosphorus Removal." American Society for Microbiology General Meeting. 
Chicago, IL. 

68. E. S. Gilbert and J. D. Keasling. 2000. "Degradation of parathion by a dual-species 
biofilm consortium." American Society for Microbiology General Meeting. Los Angeles, 
CA. 

69. A. Khlebnikov, T. Skaug and J. D. Keasling. 2000. "Elimination of all-or-none gene 
expression by independent expression of the arabinose transport gene." American Society 
for Microbiology General Meeting, Los Angeles, CA. 

70. C. D. Smolke and J. D. Keasling. 2000. "Coordinated, differential expression of 
multiple genes through directed mRNA cleavage and stabilization by secondary 
structures." American Society for Microbiology General Meeting, Los Angeles, CA. 

71. I. Aldor and J. D. Keasling. 2000. "Metabolic engineering of poly(3-hydroxybutyrate-co- 
3-hydroxyvalerate) production in recombinant Salmonella typhimurium." American 
Chemical Society National Meeting, San Francisco, CA. 

72. E. S. Gilbert and J. D. Keasling. 2000. "Degradation of parathion by a dual-species 
biofilm consortium." American Chemical Society National Meeting. San Francisco, CA. 

73. A. Khlebnikov, T. Skaug and J. D. Keasling. 2000. "A regulatable arabinose-inducible 
gene expression system with consistent control in all cells of a culture." American 
Chemical Society National Meeting, San Francisco, CA. 

74. E. S. Gilbert and J. D. Keasling. 2000. "Degradation of parathion by a dual-species 
biofilm consortium." Biofilms 2000, American Society of Microbiology, Big Sky, MT. 

75. C. D. Smolke and J. D. Keasling. 2000. "Engineering mRNA stabilizing elements to 
achieve coordinated, differential expression of two genes." FASEB Summer Conference 
in Post-Transcriptional Control of Gene Expression, Copper Mountain, CO. 

76. I. Aldor and J. D. Keasling. 2000. "Metabolic engineering of poly(3-hydroxybutyrate- 
co-3-hydroxyvalerate) production in recombinant Salmonella typhimurium." 
International Symposium on Biological Polyesters, Cambridge, MA. 

77. K. D. McMahon, N. R. Pace, J. D. Keasling, and D. Jenkins. 2000. "Microbial 
community structure of activated sludge performing enhanced biological phosphorus 
removal." California Water Environment Association Annual Conference, Sacramento, 
CA. 

78. C. D. Smolke and J. D. Keasling. 2000. "Engineering mRNA stabilizing /destabilizing 
elements to achieve coordinated differential expression of two genes." AIChE Annual 
Meeting, Los Angeles, CA. 

79. A. W. Walker, S. K. Tehara and J. D. Keasling. 2000. "Metabolic Engineering of 
Bacteria for the Environment: The Degradation of Parathion." American Institute of 
Chemical Engineers, Los Angeles, CA. 

80. D.S. Reichmuth, H.W. Blanch and J. D. Keasling. 2000. "Biodesulfurization of 
dibenzothiophene in Escherichia coli is enhanced by expression of a Vibrio harveyi 
Oxidoreductase Gene." California Catalysis Society Annual Meeting, Richmond, CA. 

81. A. W. Walker, S. K. Tehara and J. D. Keasling. 2001. "Metabolic Engineering of 
Bacteria for the Environment: The Degradation of Parathion and Paraoxon." 
Bioengineering XII, Sonoma, CA. 

82. S.K. Tehara and J.D. Keasling. 2001. "Isolation of a Novel Phosphodiesterase for 
Biodegradation of Organophosphates." American Chemical Society, San Diego, CA. 

83. D. S. Reichmuth, J. L. Hittle, H. W. Blanch, and J. D. Keasling. 2001. "Metabolic 
Engineering of the Dibenzothiophene Biodesulfurization Process." Biochemical 
Engineering XII, Sonoma, CA. 

84. G. Y. Wang and J. D. Keasling. 2001. "Isolation and characterization of two key 
regulatory genes involved in isoprenoid biosynthesis of Aspergillus nidulans." Twenty- 
First Fungal Genetics Conference, Pacific Grove, CA. 

85. N. L. Goeden, J. D. Keasling, and S. J. Muller. 2001. "Bacterial Expression of a Self-
Assembling Amphiphilic Protein Polymer." AIChE National Meeting, Reno, NV. 

86. N. L. Goeden, J. D. Keasling, and S. J. Muller. 2001. "Bacterial expression of a poly(L-
leucylglutamic acid) fusion protein for use in studying structure-property relationships of 
disordered copolymers." ACS National Meeting, San Diego, CA. 

87. C. D. Smolke and J. D. Keasling. 2001. "Effects of gene copy number and steady-state 
mRNA levels on the relative expression levels of two genes in a novel operon." 
American Chemical Society National Meeting, San Diego, CA. 

88. C. D. Smolke and J. D. Keasling. 2001. "Effects of gene copy number and steady-state 
mRNA levels on the relative expression levels of two genes in a novel operon." 
American Society for Microbiology General Meeting, Orlando, FL. 

89. V. J. J. Martin, Y. Yoshikuni, and J. D. Keasling. 2001. "A study of the in vivo synthesis 
of plant sesquiterpenes by Escherichia coli." Society for Industrial Microbiology Annual 
Meeting, St. Louis, Missouri. 

90. K. D. McMahon, D. Jenkins, J. D. Keasling. 2001. "Polyphosphate kinase genes from 
activated sludge carrying out enhanced biological phosphorus removal." Water 
Environment Federation 74th Annual Conference and Exposition (WEFTEC), Atlanta, 
GA. 

91. K. D. McMahon, J. D. Keasling, D. Jenkins. 2001. "Polyphosphate kinase genes from 
activated sludge carrying out enhanced biological phosphorus removal." International 
Association for Water Quality 3rd International Specialized Conference on 
Microorganisms in Activated Sludge and Biofilm Processes. Rome, Italy. 

92. K. D. McMahon, D. Jenkins, J. D. Keasling. 2001. "Polyphosphate kinase genes from 
activated sludge carrying out enhanced biological phosphorus removal." 101st General 
Meeting of the American Society for Microbiology, Orlando, FL. 

93. C. D. Smolke and J. D. Keasling. 2001. "Effects of gene copy number and steady-state 
mRNA levels on the relative expression levels of two genes in a novel operon." 
Biochemical Engineering XII, Rohnert Park, CA. 

94. C. D. Smolke, B. Pfleger, and J. D. Keasling. 2001. "Rational and random design 
strategies for controlling heterologous protein production from novel operon systems in 
E. coli." American Institute of Chemical Engineers Annual Meeting, Reno, NV. 

95. N. L. Goeden, J. D. Keasling, and S. J. Muller. 2002. "Microbial Production of a Self- 
Assembling Amphiphilic Protein Polymer." American Chemical Society National 
Meeting, Orlando, FL. 

96. Brian Pfleger, Christina Smolke, and Jay Keasling. 2002. "Engineering mRNA 
Stability." Annual Meeting of the Society for Industrial Microbiology. Philadelphia, PA. 

97. G. Y. Wang and J. D. Keasling. 2002. "Metabolic engineering of isoprenoid production 
in Aspergillus nidulans." Annual Meeting of the Society for Industrial Microbiology, 
Philadelphia, PA. 

98. G. Y. Wang, M. H. Chai, and J. D. Keasling. 2002. "Potential use of a novel 
geranylgeranyl diphosphate synthase gene from Aspergillus nidulans in metabolic 
engineering of isoprenoid production." American Society for Microbiology General 
Meeting, Salt Lake City, UT. 

99. G. Y. Wang, M. H. Chai, D. J. Pitera, and J. D. Keasling. 2002. "Functional 
characterization of genes involved in isoprenoid biosynthesis from Aspergillus nidulans." 
102nd General Meeting of the American Society for Microbiology, Salt Lake City, UT. 

100. S.K. Tehara and J.D. Keasling. 2002. "Purification and Characterization of a 
Phosphodiesterase from Delftia acidovorans." American Society for Microbiology, Salt 
Lake City, UT. 

101. V. J. J. Martin, D. Pitera, and J. D. Keasling. 2002. "Metabolic engineering of isoprenoid 
biosynthesis." American Society for Microbiology General Meeting, Salt Lake City, 
Utah. 

102. D. Pitera, V. J. J. Martin, and J. D. Keasling. 2002. "Isoprenoid biosynthesis: 

I, Bernhard O. Palsson, declare as follows: 

1. I am a Professor in the Department of Bioengineering at the University of 
California, San Diego (UCSD), San Diego, California. I am also an Adjunct Professor of 
Medicine at UCSD. I have held the former positions since joining UCSD in 1995, and the latter 
since 1998. I previously served on the faculty at the University of Michigan from 1984-1995. 

2. I obtained a Bachelor of Science degree, majoring in chemical engineering, in 1979 from 
the University of Kansas, and earned my Ph.D. from the University of Wisconsin-Madison in 
Chemical Engineering in 1984. A copy of my curriculum vitae and a list of publications are 
attached as Exhibit 1. 

3. I am an inventor or co-inventor on at least 35 U.S. Patents and a founder or 
co-founder of four life science companies. I co-founded Aastrom Biosciences, a public company 
that focuses on process technologies and devices for cell therapy applications, in 1989 and 
served as its Vice President of Developmental Research from 1994 to 1995. I am the founder of 
Oncosis, a company focused on the purging of occult tumor cells in autologous bone marrow, 
which was renamed Cyntellect and is now focused on building research instrumentation for cell biology. 

4. I also am a founder of Genomatica, Inc., which was incorporated in 1998 and is 
the exclusive licensee of the above-identified application. From 1998 to 2002 I served as the 
Company's Chief Executive Officer. Currently, I am the chair of Genomatica's Scientific 
Advisory Board. 

5. I am named as the sole inventor on the above-identified patent application, U.S. 
patent application serial no. 09/923,870. 

6. I have read the Office Action mailed December 18, 2008, and am very familiar 
with the prosecution history of this application. I understand that the claimed invention stands 
rejected for obviousness over the combination of references to Pramanik et al., Biotech. and 
Bioengineering 56:398-421 (1997) in view of Blattner et al., Science 277:1453-69 (1997) and in 
view of Kunst et al., Rev. in Microbiol. 142:905-12 (1991). The Examiner appears to equate the 
metabolic model described in Pramanik et al. with that described in the subject application and 
then combines it with two genome sequencing papers (Blattner et al. and Kunst et al. reporting 
the sequencing of E. coli and B. subtilis genomes, respectively) to conclude obviousness. 

7. It is my opinion that the claimed invention is not obvious over Pramanik et al., 
Blattner et al. and Kunst et al. because (1) the use of genomic data would not have been expected 
to produce a predictable model, and (2) respected colleagues in the field of 
metabolic modeling sincerely doubted that such a model would work, even after its publication. 

8. One skilled in the art would not have been motivated to use genomic data to 
produce a stoichiometric matrix because the predictability of an in silico model relies on the 
accuracy of the input data. Genome annotation is incomplete and may contain errors in 
the functional annotation of genes. An organism's metabolism, and the reconstruction of its 
metabolic network from experimental data, is scientifically very complex. At the time the 
invention was made, the metabolic network of a simple organism was known to have hundreds of 
reactions, substrates, products, co-factors and regulatory interventions and to depend on many 
other events occurring within a cell. The predictability of a metabolic model depended on the 
accuracy of this complex set of reactions and events. Therefore, metabolic models prior to my 
discovery employed either known kinetic or biochemical data of known reactions or events in 
order to ensure the most accurate model possible. 

9. It was incomprehensible to those in the field of metabolic modeling and metabolic 
engineering to incorporate reactions based on putative gene sequence homology without 
knowledge that the reaction existed in the modeled organism and without any known kinetic or 
biochemical information because incorrect deductions would decrease, rather than increase, the 
predictive capability of the resultant metabolic model. Therefore, those in the field would have 
discouraged incorporating putative metabolic reactions into an in silico model based on gene 
sequence homology to a different organism because it could have led to incorporating erroneous 
reactions and a loss of a model's accuracy. 

10. The unexpected nature of my discovery that genomic data can be incorporated 
into a model to construct a stoichiometric matrix without other knowledge is borne out by the 
disbelief of respected colleagues in the field at the time the invention was made. In addition to 
the concerns about the accuracy of the genomic data, my colleagues doubted that a metabolic 
reconstruction containing information about only 15% of the identified genes in E. coli and no 
regulatory information would be able to lead to meaningful computations about physiological 
functions. 

11. Professor James E. Bailey, a respected and prestigious scientist in the field of 
metabolic engineering, is one such colleague who, on the one hand, found the success of my 
approach hard to believe but, on the other hand, characterized my genomic-flux balance approach, 
if the model worked as purported, as a "very major advance," a "breakthrough." 

12. Professor Bailey passed away on May 9, 2001. At the time of my invention, 
Professor Bailey was a professor in the Institute of Biotechnology, ETH Zurich, Swiss Federal 
Institute of Technology in Zurich, Switzerland. Professor Bailey held a great deal of stature and 
respect in the field of metabolic engineering. Professor Bailey's many accomplishments 
include the foundational paper on Metabolic Engineering ('Toward a Science of Metabolic 
Engineering' Science 252:1668-1675, 1991, attached as Exhibit 2) and the publication of the 
standard text book in biochemical engineering (J.E. Bailey and D.F. Ollis, BIOCHEMICAL 
ENGINEERING FUNDAMENTALS, 2nd edition 1986, 984 pages total, McGraw-Hill, New 
York). Publications in memory of Professor Bailey summarizing his accomplishments, 
scientific career and stature in the field of metabolic engineering are attached as Exhibits 3 and 4. 

13. Attached as Exhibit 5 is a letter from Professor Bailey to me dated July 8, 1999, 
commenting on my publication describing the robustness of an E. coli metabolic model that used 
genomic data in its construction. A copy of the publication that precipitated Professor Bailey's 
letter is attached as Exhibit 6 (Edwards and Palsson, Biotechnol. Prog. 16:927-39 (2000)). 

14. With some tenacity, Professor Bailey questions the operability of a genomic-flux 
balance model and the case studies reported in Exhibit 4, finding them "hard[] ... to 
understand" (paragraph 2) and raising a number of questions regarding how the genomic model 
arrives at rates from stoichiometry (paragraph 3). Professor Bailey also questions the inclusion 
of 47 enzymes in addition to those implied by the genome sequences (paragraph 4). These steps 
were added based on biochemical information alone. 

15. In the concluding paragraph Professor Bailey shows his very real skepticism and 
reservation by explicitly questioning whether the genomic model is correct or whether it works. 
Conversely, if the model operates as described, Professor Bailey characterizes the discovery as a 
breakthrough. This skepticism and acknowledgement that the discovery is a breakthrough in the 
field of metabolic engineering are clear when Professor Bailey states: 

This method, if it is correct and if it works, is a very major advance, because it 
give a formalism for determining growth rates and pathway rates without any 
knowledge of any kinetics. I must say it is a little hard to believe that this can be 
done, but maybe you have made a breakthrough. Id. (emphasis added). 

Because Professor Bailey was an opinion leader in the field, it is fair to say that his skepticism 
characterized the attitudes of my colleagues at the time. 

16. Since my initial publication of a genomic-flux balance metabolic model and the 
above interaction with Professor Bailey, there have been probably more than 100 subsequent 
publications from my laboratory and from others in the field documenting the 
operability and predictive power of this type of metabolic model. This discovery was 
unexpected because we found that we could incorporate metabolic reactions based on genomic 
deductions and the model was able to accurately predict metabolic capabilities of the modeled 
organism. As Professor Bailey would characterize it now that there are several hundred 
subsequent publications to quell his initial skepticism, this discovery was a very major advance 
or breakthrough in the field. For example, for E. coli alone, about 70 such scientific studies were 
reviewed in June 2008 (Nature Biotechnology, 26: 659-667, 2008), attached as Exhibit 7. 
Notable predictions by the genomically-based reconstruction were the computation of the effects 
of gene knock-outs (Edwards and Palsson, Proc. Nat'l Acad. Sci. USA, 97:5528-33 (2000) 
(Exhibit 8); Covert et al., Nature, 429:92-96 (2004) (Exhibit 9)) and the outcomes of adaptive 
evolution (Ibarra et al., Nature, 420:186-189 (2002) (Exhibit 10)). These predictions were 
unexpected and widely noticed by the scientific community. 

I hereby declare that all statements made herein of my own knowledge are true and that 
all statements made on information and belief are believed to be true; and further that these 
statements are made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code and that any such willful false statement may jeopardize the validity of the application or 
any patent issued thereon. 

20. N.D. Price, J.A. Papin, and B.O. Palsson, "In silico cells: studying genotype-phenotype 
relationships using constraints-based models," Metabolic Engineering in the Post-Genomic 
Era, 2003, Horizon Scientific Press. 

21. Joyce A.R. and Palsson B.O. (2006) Toward whole cell modeling and simulation: 
Comprehensive functional genomics through the constraint-based approach, in Systems 
Biological Approaches in Infectious Diseases, 64:265-311, H.I. Boshoff and C.E. Barry, Eds 
(2006). 

22. Herrgard MJ and Palsson BO., Genome-Scale Models of Metabolic and Regulatory Networks, 

EDITORIAL ACTIVITIES 



1. Special Editor, with Jeffrey A. Hubbell and E.T. Papoutsakis, of Tissue Engineering & Cell 
Therapies: I and II, special issues of Biotechnology and Bioengineering, 43:7 and 43:8, March 
25 and April 5, 1994, John Wiley & Sons, Inc., Publishers. 

2. Editorial Board, Tissue Engineering, Charles A. Vacanti and Antonios G. Mikos, editors, Mary 
Ann Liebert, Inc., Publishers. (1994 to 2000) 

3. Board of Associate Editors, Mathematical Problems in Engineering: Problems, Theories, and 
Applications, V. Lakshmikantham and S.M. Meerkov, Editors, Gordon & Breach Publishing 
Co. (1994 to 2000) 

4. Editorial Board, Annals of Biomedical Engineering, James B. Bassingthwaighte and Daniel A. 
Hammer, Editors (1995 to present) 

5. Section editor for Tissue Engineering, with Jeffrey A. Hubbell, The Biomedical Engineering 
Handbook, Editor-in-Chief Joseph D. Bronzino, CRC Press, Boca Raton (1995). 

6. Editorial Board, Biotechnology and Bioengineering, Douglas Clark Editor-in-Chief, Wiley and 
Sons (1996-present) 

7. Editorial Board, Metabolic Engineering, G. Stephanopoulos, M. Yarmush, and A. Sinskey, 
Editors, Academic Press. 

8. Advisory Editorial Board, Nature Molecular Systems Biology, (2004-Present). 

9. Editorial Board, Journal of Bacteriology, (2004-Present). 

10. Editorial Board, Journal of Biological Chemistry (2007-Present). 



PROPOSALS ADMINISTERED 



TITLE | SOURCE | AMOUNT | CO-PI/PERIOD

Mathematical Modeling of Anaerobic Digestion Dynamics | Michigan Biotechnology Institute | $22,500 | 18 months (1985)
Use of Kinetic Models to Improve Blood Storage | Whitaker Foundation | $82,532 | Nov 1985 - Oct 1987
Cell Culture Facility | University of Michigan NIH Biomedical Research Support Grant | $59,500 | Professors J. S. Schultz and H. Y. Wang (1986)
Cell Culture Facility | NSF equipment grant | $21,300 | H.Y. Wang Co-PI (1986)
Purchase of a Spectrofluorometer | University of Michigan NIH Biomedical Research Support Grant | $14,000 | Professor J. S. Schultz (1987)
Cellular Bioengineering | From the Presidential Initiatives Fund at the University of Michigan | $90,000/Yr. | Prof. M. Savageau (PI), with Profs M.E. Meyerhoff and A. R. Midgley, June 1987 - May 1990
Metabolic Dynamics in the Red Cell | FIRST award from NIH | $90,000/Yr. | Sept 1987 - Aug 1992
Efficient Monoclonal Antibody Production | NSF Biotechnology Cluster Grant | $803,520 | Professors M. A. Savageau and A. R. Midgley, Sept 1987 - Sept 1990
Life Support Systems via the Use of Cell Culture | CAMRSS, a NASA funded CCDS center | $150,000 | Prof. P. Kaufman and Dr. T. Huard, Feb 1988 - Jan 1989
Construction of a High-efficiency Ex vivo Bone Marrow | NSF Tissue Engineering Initiative | $50,000 | Drs. S. Emerson (PI) and M. Clarke in Internal Medicine, Sept 88 to Aug
Gravitropic Response Mechanism in Cereal Grass Shoots | NASA | $55,000 | Prof. P. Kaufman (PI) 2/89-6/89
Development of an Ex vivo Bone Marrow System | Hambrecht & Quist | $300,000 | S.G Emerson (PI) and M. Clarke 4/89-10/89
Development of a Photo-bioreactor and Green Plant Cell Lines for CELSS | CAMRSS, a NASA funded CCDS Center | $250,000 | Professor P. Kaufman, Feb 1989 - Sept 1989
Development of an Ex vivo Bone Marrow System | Ann Arbor Stromal Inc. | $900,000 | S. Emerson (PI) and M. Clarke 8/89-4/92
Development of an Algal Photo-bioreactor | CAMRSS, a NASA-funded CCDS center | $120,000 | Nov. 1989 - Sept 1990
Establishment of a Digital Imaging Facility | BSRG/NIH | $27,000 | Prof. J.J. Linderman (PI) April 1990 - March 1991
Efficient Monoclonal Antibody | NSF Biotechnology Cluster Grant | $664,098 | Prof. M. E. Meyerhoff, Nov 1990 - Oct 1993
Photo-bioreactor Engineering: Light Delivery and Algal Growth and Gas Exchange Rates | NASA/Headquarters | $960,713 | Prof. T. L. Killeen, Jan 1991 to Jan 1994
Development of an ex vivo Retroviral Infection System |  | $30,000 | Post-doctoral Fellowship 1/14/91-10/13/92
Development of a Hematopoietic Bioreactor System | Aastrom Biosciences Inc. | $45,000 | Post-doctoral Fellowship 5/1/92-10/31/93
Ex vivo Growth and Manipulation of Human Hematopoietic Cells - Hematopoietic Cell Expansion Bioreactor Design and Retroviral Gene Transfer | Aastrom Biosciences Inc. | $532,259 | 3/1/92-6/30/93
Shear Sensitivities of Human Bone Marrow Cultures |  | $315,000 | 12/1/92-12/1/93
Hematopoietic Bioreactor Design with Applications to Gene Therapy | Aastrom Biosciences, Inc. | $344,001 | 7/1/93-12/31/94
Biological Determinants of Photobioreactor Design | Department of Energy | $99,898 | 9/1/93-8/31/95
Hematopoietic Cell Expansion System | NIH SBIR Phase II Award | $520,000 | 3/1/95-2/29/96
Biochemical Engineering: Genomatics and Whole Cell Simulators | UC Biotechnology Program | $80,000 | 7/1/96-6/30/98
Shear Sensitivities of Human Bone Marrow Cultures | NASA Biotechnology Program | $190,000 | 10/1/95-12/1/97
Stem Cell Motility | The Stern Foundation | $50,000 | 11/1/97-10/30/98
Hematopoietic Stem Cell Motility | National Institutes of Health, R01 HL59234 | $152,207 annual direct costs | Jan 1, 1998 to Dec 31, 2001
Genomically Based Models for | National Institutes of Health, R01 GM57089 | $123,000 annual direct costs | 7/1/98-6/30/01
Mechanisms of Stem Cell Migration | National Institutes of Health, R01 HL 60398 | $200,000 annual direct | 7/10/98 to 7/9/02
In Silico Analysis of the Escherichia Coli Metabolic Genotype and the Construction of Selected Isogenic Strains | National Science Defense, BES-9814092 | $88,388/Yr | 
Computational Infrastructure for Engineering Microorganisms | National Science Foundation/KDI, SBR-9873384 | $297,218 | 10/1/98-9/30/00
Genome-scale in silico Model for E. coli | National Institutes of Health, R01 GM057089 | $250,000 | 8/1/98-4/30/11
Tissue Engineering | Whitaker Foundation/Teaching Materials Project | $89,962 | 11/1/99-1/1/03
Kinetic and Regulatory Constraints on Metabolism | National Science Foundation/BES-0120363 | $469,934 | 9/1/01-8/31/03
Growth | RoTGMe^gi' 68 | $432,254 annual direct costs | 4/01/01-5/31/10
Antibiotic | R01 GM57089 | $250,000 annual direct costs | 4/1/03-3/31/08
Systems Biology and Bioengineering | Whitaker Foundation/Teaching Materials Project | $84,713 | 4/1/03-11/30/04
Network Based Analysis of Kinetics and Regulation | National Institutes of Health/R01 GM68837 | $225,000 annual direct costs | 7/15/03-6/30/07
Reconstruction and Simulation of Genome-Scale Regulatory Networks | National Science Foundation/BES-0331342 | $432,697 | 10/1/03-9/30/06
A Genome-Scale Regulated Metabolic Model of Yeast | National Institutes of Health, R01 GM071808 | $446,123 annual direct costs | 7/01/04-6/30/09
Analysis of the Genetic Potential and Gene Expression of Microbial Communities Involved in the in situ Bioremediation of Uranium and Harvesting Electrical Energy from Organic Matter | DOE, 03-003721 G 00 | $254,750 | 9/1/06-1/31/09
Systems-Level Understanding of Hydrogen Production by Thermotoga maritima | DOE, DE-PS02-08ER08-12 | $506,489 | 9/15/08-9/14/11
A Systems Biology Program to Study Infectious Microorganisms in Taiwan | NHRI (Taiwan) | $1,052,904 | 7/1/08-8/31/11


INVITED TALKS PRESENTED AT MEETINGS: 

1. B. O. Palsson, "Making Mathematical Descriptions of Metabolic Reaction Networks Manageable" 
Presented at the 3rd Henry Goldberg Workshop on "Simulation and Modeling of the Cardiac 
System: from Cellular Activation to Muscular Activity", March 31 - April 2, 1986, Rutgers 
University, New Brunswick, NJ. 

2. B. O. Palsson, A. Joshi, and S. S. Ozturk, "Reducing Complexity in Metabolic Networks: Making 
Metabolic Meshes Manageable", FASEB meeting, St. Louis, MO, April 14-18, 1986. 

3. B. O. Palsson, I-der Lee, and A. Joshi, "A Comprehensive Computer Model of Human Red Cell 
Metabolism," International Symposium on Mathematical Models of Cellular Processes, Holzhau, 
GDR, November 19-23, 1989 

4. B. O. Palsson, S. Emerson, M. Clarke, R. Schwartz, and J. Caldwell, "Reconstitution of a 
Functioning Bone Marrow," The 1989 International Chemical Congress of Pacific Basin Societies, 
Symposium on Cell Culturing, Honolulu, Hawaii, December 17-22, 1989 

5. B. O. Palsson, "Metabolic Modeling, Design and Operation of Continuous Perfusion 
Hematopoietic Cultures," UCLA Symposium on Tissue Engineering, Keystone CO, April 5-12, 
1990 

6. Minoo Javanmardian and B. O. Palsson, "Can Biotechnology Help with Global Warming?," 
American Chemical Society 200th National Meeting, Washington, D.C., August 26-31, 1990 

7. J. D. Keasling, B. O. Palsson, and S. Cooper, "Cell-cycle-specific F' lac Plasmid Replication: 
Regulation by Cell Size Control of Initiation," European Molecular Biology Organization Meeting on 
the Bacterial Cell Cycle, Collonges-la-Rouge, France, September 30 - October 5, 1990. 

8. B. O. Palsson, "Hematopoietic Bioreactor Systems," Engineering Foundation Conference - Cell 
Culture Engineering III: Tissue Engineering, Palm Coast, Florida, February 2 - 7, 1992. 

9. A. Peng and C. G. Lee, "Two Novel Bioreactor Systems for the Cultivation of Human Bone 
Marrow and for Growing Algae Photoautotrophically," SIM/CSM Joint Meeting - Symposia on 
Novel Bioreactors, Toronto, Canada, August 1 - 6, 1993. 

10. B. O. Palsson, "Hematopoietic Tissue Engineering: From Basic Principles to Clinical Practice," 
IFMBE 1st International Conference on Cellular Engineering, Stoke-on-Trent, UK, September 12 - 
15, 1993. 

11. B. O. Palsson, "Cell and Tissue Engineering: An Emerging Discipline?" N + N Meeting on Cellular 
Engineering, USA and UK, Chester, UK, September 15-17, 1993. 

12. B. O. Palsson, "Tissue Engineering and Differentiating Cell Types," Frontiers in Bioprocessing III, 
Boulder, CO, September 19-23, 1993. 

13. B. O. Palsson, "Shear Sensitivities of Human Bone Marrow Culture," Investigators' Meeting, 
NASA/Johnson Space Center, Houston, TX October 15-16, 1993. 

14. C-G. Lee and B.O. Palsson, "Design and Performance of an LED-Based Algal Photobioreactor," 
International Winter Meeting of the American Society of Agricultural Engineers (ASAE) Chicago, 
IL, December 14-17, 1993. 

15. B. O. Palsson, "Hematopoietic Perfusion Bioreactors: Scientific and Clinical Utility," Keystone 
Symposia on Molecular and Cellular Biology: Tissue Engineering, Taos, New Mexico, February 
20-26, 1994 

16. B. O. Palsson, "Expansion of Progenitor Cells," The Seventh International Symposium on 
Autologous Bone Marrow Transplantation, Arlington, TX, August 17-20, 1994. 

17. B. O. Palsson, "Hematopoietic Tissue Engineering," ESACT/JAACT Meeting Animal Cell 
Technology 'Developments towards the 21st Century,' Veldhoven, The Netherlands, September 
12-16, 1994. 

18. B. O. Palsson, "Perfusion-Based Bioreactors for the Expansion of Human Bone Marrow, Stem 
and Progenitor Cells," BPEC Symposium and Workshop on Bioprocessing Needs for Cell Based 
Therapies, MIT, Cambridge, MA, January 18-19, 1995. 

19. B. O. Palsson, "Tissue Engineering of Bone Marrow," National Engineering Forum, UC San Diego 
School of Engineering, San Diego, May 4-5, 1995. 

20. B. O. Palsson, M.R. Koller and J. Maluta, "Hematopoietic Tissue Engineering Challenges in 
Producing Clinical Scale Cell Populations," Engineering Foundation Biochemical Engineering IX: 
Interdisciplinary Foundations for Creating New Biotechnology, Davos, Switzerland, May 21-26, 
1995. 

21. B. O. Palsson, M.R. Koller and J. Maluta, "Hematopoietic Tissue Engineering Challenges in 
Producing Clinical Scale Cell Populations," 2nd International Meeting on Cell Engineering, San 
Diego, August 1995 

22. B. O. Palsson and A. Varma, "Metabolic Flux Balancing: basic concepts, scientific and practical 
use," Recent Advances in Fermentation Technology, San Diego, November 4-7 1995 

23. BO Palsson, J. Maluta, CA Peng, RD Armstrong and MR Koller, "Scientific and Clinical 
Applications of an Ex vivo Hematopoietic Model," Tissue Engineering, Keystone Symposia, 
Taos, NM Jan 1996 

24. BO Palsson, AS Chuck and G Huang, "Gene Therapy: The Importance of Random Brownian 
Motion," Cell Culture Engineering V, Engineering Foundation Meetings, San Diego, Jan 1996 

25. BO Palsson, AS Chuck and G Huang, "Retroviral Infection is Limited by Random Brownian 
Motion," Gene Therapy for Hematopoietic Stem Cells in Genetic Disease and Cancer, Keystone 
Symposia, Taos, NM Feb 1996 

26. BO Palsson, "Retrovirally-Mediated Gene Transfer is Limited by Random Brownian Motion," 4th 
International Symposium on Recent Advances in Hematopoietic Stem Cell Transplantation: 
Clinical Progress, New Technology, and Gene Therapy. UCSD/ISHAGE, San Diego, CA, April 
11-13, 1996 

27. BO Palsson, "How Should We Approach the 'Engineering' of Metabolic Function," Recombinant 
DNA Biotechnology: Focus on Metabolic Engineering, Engineering Foundation Conferences, 
Ferncroft Conference Resort, Danvers, MA, October 6-11, 1996 

28. BO Palsson, "Modeling Challenges in Tissue Engineering and Complex Systems," Modeling in 
Biochemical Engineering, October 11-12, 1996 at the University of Minnesota, Minneapolis/St. 
Paul, MN. 

29. BO Palsson, "Genetic Circuits," Chemical Engineering and Living Systems, in Honor of Professor 
EN Lightfoot, Nov 9-10, 1996, Madison, WI 

30. J Edwards, R Ramakrishna and BO Palsson, "Significant Flexibility Exists in the Core Metabolic 
Pathways," Soc. for Industrial Microbiology Annual Meeting, Reno, NV, Aug 3-7th, 1997 

31. B.O. Palsson "What lies Beyond Bioinformatics?," National Academy of Engineering, Annual 
Meeting, October 6 to 8, 1997, Washington DC 

32. B. O. Palsson "The Importance of Stem Cells in Tissue Engineering: lessons learned from 
hematopoiesis," Tissue Engineering, Keystone Symposia, Copper Mountain, Jan 1998 

33. B. O. Palsson "The Importance of Stem Cells in Tissue Engineering: lessons learned from 
hematopoiesis," Cell Culture Engineering VI, Engineering Foundation Conference, Pacific Beach, 
February 8-14, 1998 

34. BO Palsson, "What Lies Beyond Bioinformatics?," Bioinformatics Workshop, Sat. April 4, 1998 
San Diego Supercomputer Center Auditorium 

35. BO. Palsson, "Mechanisms of Stem Cell Migration," 6th International Symposium on Recent 
Advances in Hematopoietic Stem Cell Transplantation, April 16-18 San Diego, 1998 

36. BO. Palsson, "Bioinformatics-Their Role in Modern Scientific Investigation and Cardiac Disease," 
6th Antwerp-La Jolla-Kyoto Research Conference on Cardiac Function, La Jolla, April 25-27, 1998 

37. BO. Palsson, "Towards Metabolic Phenomics: analysis of genomic data using flux balances," 
Metabolic Engineering II, Elmau, Germany, Oct 25-30, 1998 

38. B.O. Palsson, "Haemophilus influenzae Metabolic Genotype; Its definition, Systems, 
Characteristics, and Capabilities," AIChE Annual Meeting, Miami Beach, FL November, 1998. 

39. B.O. Palsson, "Synthesizing and Characterizing In Silico Bacterial Strains," In Silico Biology 
Meeting, San Francisco, CA, June 1999. 

40. B.O. Palsson, "Building Metabolic Models from Annotated Genes," Biochemical Engineering XI: 
Molecular Diversity In Discovery and Bioprocessing, Park City, UT, July 1999. 

41. B.O. Palsson, "Life on the Edge: using genome-scale in silico models of microorganisms to 
interpret and predict metabolic phenotypes," 11th International Genome Sequencing and Analysis 
Conference, Miami, FL, September 1999. 

42. B.O. Palsson, "What Lies Beyond Bioinformatics?," BMES/EMBS Joint Meeting, Atlanta, GA, 
October 1999. 

43. B.O. Palsson, J.S. Edwards, and R.I. Ibarra, "The generation of experimentally testable 
hypotheses in silico using an Escherichia coli metabolic model," 4th Annual Hilton Head Workshop, 
Hilton Head, SC, February, 2000. 

44. B.O. Palsson, "Reconstruction of Metabolic Networks in silico and Formulation of Testable 
Experimental Hypotheses," Bioinformatics 2000, Elsinore, Denmark, April, 2000. 

45. B.O. Palsson, "Living on the edge: E. coli optimizes its growth within governing physico-chemical 
constraints," In Silico Biology Conference, San Francisco, June, 2000. 

46. B.O. Palsson, "Functional Genomics," The Whitaker Foundation Biomedical Engineering 
Education Summit, Arlington, VA, December, 2000 

47. B.O. Palsson, "Models of microorganisms to interpret and predict metabolic phenotypes: basic 
concepts, scientific and applied uses," NASCRE 1 Conference, Houston, TX January, 2001. 

48. B.O. Palsson, "Single and multi cellular communication pathways and their transduction," 
DOE/NSF Workshop on Biological Information Processing and Systems, Greenville, SC, 
January, 2001. 

49. B.O. Palsson, "The challenges of in silico biology," Workshop on Challenges and Opportunities in 
Data Management, Palo Alto, CA, March, 2001. 

50. B.O. Palsson, "Life on the edge: Using genome-scale in silico models of microorganisms to 
interpret and predict metabolic phenotypes," Recovery of Biological Products 10, Cancun, Mexico, 
June, 2001 

51. B.O. Palsson, "Influence of Bioinformatics on Metabolic Engineering," CAB8 Conference, Quebec, 
Canada, June, 2001. 

52. B.O. Palsson, "Do in silico models of metabolism represent their in vivo counterparts well?" 
Beyond Genome 2001, San Francisco, June, 2001. 

53. B.O. Palsson, "The Phase transition from in vivo to in silico biology," The Bayer Lectures in 
Chemical Engineering, Berkeley, September, 2001. 

54. B.O. Palsson, "Life on the edge: using genome scale in silico models of microorganisms to 
interpret and predict metabolic phenotypes," DOE 9th International Conference on Microbial 
Genomes, Gatlinburg, TN, October, 2001. 

55. B.O. Palsson, "Data-Driven Constraints-Based Models in Biology," AIChE Meeting, Reno, NV, 
November, 2001. 

56. B.O. Palsson, "The Phase transition from in vivo to in silico biology," Syngenta, December, 2001 

57. B.O. Palsson, "The Phase transition from in vivo to in silico biology," CHI Metabolomics Meeting, 
December, 2001 

58. B.O. Palsson, "The phase transition from in vivo to in silico biology," ETH, Zurich, Switzerland, 
January, 2002. 

59. B.O. Palsson, "The phase transition from in vivo to in silico biology," Novartis, Basel Switzerland, 
January, 2002. 

60. B.O. Palsson, "Data Driven Constraint-Based Models in Cell Biology," 6th Annual Lake Tahoe 
Symposium, Granlibakken, Lake Tahoe, CA, January, 2002. 

61. B.O. Palsson, "From bioreactor design to genetic circuits," Lindbergh-Carrel Symposium, 
Charleston, SC, February, 2002. 

62. B.O. Palsson, "Model-Centric Integrated Genomic Databases," Frontiers of Genomics, Madison, 
Wisconsin, May 16-17, 2002. 

63. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," Cuernavaca, Mexico, May 
23-25, 2002. 

64. B.O. Palsson, "The New Biotechnology Seminar Series," Campbell and Flores, LLP, June, 2002. 

65. B.O. Palsson, "Keynote Speaker," IEEE Computer Society Bioinformatics Conference, Stanford 
University, August 14-16, 2002. 

66. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," Engineering Foundation 
Conferences, Metabolic Engineering IV, Barga, Italy, October 6-11, 2002. 

67. B.O. Palsson, FPB Division Award Talk, "The phase transition from in vivo to in silico biology," 
AIChE Meeting, Indianapolis, IN, November 3-5, 2002. 

68. B.O. Palsson, "Genome-Scale Models for Prospective Metabolic Engineering," CHI— Metabolic 
Engineering: Modifying Metabolic Pathways Conference, Raleigh, NC, December 2-3, 2002. 

69. B.O. Palsson, "Laser-Enabled High-Throughput High-Content Single-Cell Analysis," IBC— Cell 
Based Assay and Screening, Philadelphia, PA December 4-6, 2002. 

70. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," Harvard University 
Department of Genetics, December 10, 2002. 

71. B.O. Palsson, "Computer-assisted search for new antimicrobials," 2002 CSPA ANTI Special 
Session, Ft. Lauderdale, FL, December, 2003. 

72. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," ICSB 2002, Stockholm, 
Sweden, December 13-15, 2002. 

73. B.O. Palsson, "A model-driven analysis of expression data for E. coli and for Yeast," EMBO 
Practical Course, University of Milano-Bicocca, Italy, January 13, 2003. 

74. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," Plenary Talk-Enzymes & 
Biocatalysis for Drug Discovery and Development, San Diego, CA, January 30, 2003. 

75. B.O. Palsson, "Genome-scale analysis for new metabolic engineering procedures," National 
Science Foundation Metabolic Engineering Conference, January 31, 2003. 

76. B.O. Palsson, "Bringing Genomes to Life: The Key Role of Genome-scale Computer Models," IRI 
Frontiers of Technology - 2003 Conference, February 27-28, 2003. 

77. B.O. Palsson, "Novel HTC/HTS Technology," BioIT Meeting, Boston, March 26, 2003. 

78. B.O. Palsson, "Bringing Genomes to Life: The Use of Genome-scale in silico Models," NIH 
Computational Approaches to Biological Systems Seminar Series, March 27, 2003. 

79. B.O. Palsson, "Bringing Genomes to Life: The Use of Genome-scale in silico Models," IBM 
Research Computational Biology Seminar, March 28, 2003. 

80. B.O. Palsson, "Bringing Genomes to Life: The Use of Genome-scale in silico Models," UCSD 
Bioinformatics Symposium, March 29, 2003. 

81. B.O. Palsson, "Bringing Genomes to Life: The Use of Genome-scale in silico Models," Plenary 
Talk-Northwestern University Computational Science and Engineering Spring Symposium, April 3, 
2003. 

82. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models" Duke Center for 
Bioinformatics and Computational Biology Conference, May, 2003. 

83. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," Keynote Address-ESCAT 
Meeting, Spain, May, 2003. 

84. B.O. Palsson, "Chairperson's Remarks," CHI Beyond Genome 2003, San Diego, June, 2003. 

85. B.O. Palsson, "E. coli i2K," IECA2003, Tsuruoka, Japan, June 21-26, 2003. 

86. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," The Nyhan Center 
Planning Meeting, Palo Alto, September, 2003. 

87. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," GTL Data Standards 
Workshop, San Francisco, CA, September, 2003. 

88. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," ERATO Kitano Project, 
Tokyo, Japan, September, 2003. 

89. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," Southern California 
Biotechnology Symposium, Laguna Beach, CA, September, 2003. 

90. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," NAE Annual Meeting, 
Washington, DC, October, 2003. 

91. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," International Symposium 
on New Horizons in Molecular Sciences and Systems: An Integrated Approach, Okinawa, Japan, 
October, 2003. 

92. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," IBC Systems Biology for 
Drug Discovery and Development, Boston, MA, October, 2003. 

93. B.O. Palsson, "Bringing Genomes to Life: The Use of in silico Models," NIH SysBio SIG Retreat, 
Washington, DC, November, 2003. 

94. B.O. Palsson, "Systems Biology and Genetic Circuits," 2004 Gordon Conference in Molecular 
Evolution, Ventura, CA, February, 2004. 

95. B.O. Palsson, "Metabolomics," Mass Spectrometry in Systems Biology, Keynote Address, Santa 
Fe, NM, February, 2004. 

96. B.O. Palsson, "Bringing Genomes to Life: The Use of Genome-Scale in silico Models," Rutgers 
Collaborates, XIV, Rutgers University, March, 2004. 

Toward a Science of Metabolic Engineering 

James E. Bailey 



Application of recombinant DNA methods to restructure 
metabolic networks can improve production of metabo- 
lite and protein products by altering pathway distribu- 
tions and rates. Recruitment of heterologous proteins 
enables extension of existing pathways to obtain new 
chemical products, alter posttranslational protein pro- 
cessing, and degrade recalcitrant wastes. Although some 
of the experimental and mathematical tools required for 
rational metabolic engineering are available, complex cel- 
lular responses to genetic perturbations can complicate 
predictive design. 



THE METABOLIC ACTIVITIES OF LIVING CELLS ARE ACCOM- 
plished by a regulated, highly coupled network of ~1000 
enzyme-catalyzed reactions and selective membrane trans- 
port systems. However, metabolic networks that evolved in natural 
settings are not genetically optimized for the objectives important in 
practical applications. Hence, performance of bioprocesses can be 
enhanced by genetic modification of the cells. 

Metabolic engineering is the improvement of cellular activities by 
manipulation of enzymatic, transport, and regulatory functions of 
the cell with the use of recombinant DNA technology. The oppor- 
tunity to introduce heterologous genes and regulatory elements 
distinguishes metabolic engineering from traditional genetic ap- 
proaches to improve the strain. This capability enables construction 
of metabolic configurations with novel and often beneficial charac- 
teristics. Cell function can also be modified through precisely 
targeted alterations in normal cellular activities. Examples in the 
manipulation of protein processing pathways, as well as of pathways 
involving smaller metabolites, will be highlighted here. 

At present, metabolic engineering is more a collection of examples 
than a codified science. Results to date promise future technological 
benefits, as well as contributions to basic science, agriculture, and 
medicine. However, many studies have shown the feasibility of 
metabolic engineering methods without achieving the yields, rates, 
or titers (final concentrations) required for a practical process. Most 
experiments explore changes in a single gene, operon, or gene 
cluster. After a new strain has been created by such a manipulation, 
limitations arise that can in principle be addressed by subsequent 
genetic manipulation. An iterative cycle of a genetic change, an 
analysis of the consequences, and a design of a further change, 
analogous to that articulated for protein engineering (1), can be
used to find an optimized strain. The few cases to date in which such 
a metabolic engineering cycle has been implemented have achieved 
success. An emerging base of strategies, tools, and experiences will 
aid in identifying, implementing, and refining which particular set of 
genetic manipulations is most effective in accomplishing a desired
change in cellular function. 

Recruiting Heterologous Activities for Strain 
Improvement 

Cloning and expression of heterologous genes can serve several 
useful purposes, including extending an existing pathway to obtain 
a new product, creating arrays of enzymatic activities that synthesize
a novel structure, shifting metabolite flow toward a desired product,
and accelerating a rate-determining step. Introduction of a functional
heterologous enzyme or transport system into an organism can
result in the appearance of new compounds that may subsequently
undergo further reactions. Difficulties in anticipating these further
reactions are a central limitation of metabolic engineering.

Expression of a heterologous protein does not guarantee appearance
of the desired activity. The protein must avoid proteolysis, fold
properly, accomplish any necessary assembly and prosthetic group
acquisition, be suitably localized, have access to all required substrates,
and not encounter an inhibitory environment. Despite these
potential barriers to the successful recruitment of heterologous
cellular activities, the number and scope of positive experiments
encourage further application of this approach.

Synthesis of new products is enabled by completion of partial pathways.
The genetic and metabolic diversity that exists in nature provides a
collection of organisms with a spectrum of substrate assimilation
and product synthesis capabilities. However, many natural strains
are imperfect from an applied perspective. Their performance can
sometimes be enhanced by extension of their native pathways.
Native metabolites can be converted to preferred end products by
the genetic installation of a few well-chosen heterologous activities
(Table 1).

For example, the final precursor in a current commercial process 
for ascorbic acid (vitamin C) synthesis is 2-keto-L-gulonic acid 
(2-KLG). One route to 2-KLG involves two successive fermenta- 
tions. The first converts glucose to 2,5-diketo-D-gluconic acid 
(2,5-DKG) in Erwinia herbicola; the second fermentation, carried 
out in a species of Corynebacterium, transforms 2,5-DKG to 2-KLG. 
Researchers devised a way to convert glucose to 2-KLG in a single 
fermentation step by cloning the Corynebacterium enzyme 2,5-DKG 
reductase, which catalyzes the 2,5-DKG to 2-KLG conversion, into 
E. herbicola (2). A similar goal was achieved for 7-aminocephalo- 
sporanic acid (7ACA), the precursor for several semisynthetic 
cephem antibiotics (3). 

Posttranslational modifications can influence the function of 
proteins. The types of modifications that occur can be affected by 
expression of cloned protein processing enzymes. For example, expression
in Chinese hamster ovary (CHO) cells of β-galactoside α2,6-sialyltransferase
(4) allows the formation of sialyl α2,6-galactosyl
linkages on their surface glycoproteins. These terminal glycosylation
linkages on its surface glycoproteins. These terminal glycosylation 
linkages are normally absent from proteins produced in this industrial 
cell line, including cloned erythropoietin. Thus, this strategy should
enable erythropoietin made in recombinant CHO cells to more closely 
resemble human erythropoietin, which is rich in these linkages (5). In 
another study, mouse cells displayed the human H blood group 
antigen after transfection with human DNA (6). 

Transferring multistep pathways: Hybrid metabolic networks. The trans- 
fer of genes that encode entire biosynthetic pathways to a heterolo- 
gous host can provide more industrially robust strains, enhance 
productivity, or permit the use of less costly raw materials. Moreover, 
such experiments are useful for exploring the regulation and function 
of a multistep metabolic pathway in a particular species. 

Transferring entire antibiotic biosynthetic pathways to heterolo- 
gous hosts has been facilitated by the clustering of the genes 
involved (7). Genes for the biosynthesis of actinorhodin were
transferred from Streptomyces coelicolor to Streptomyces lividans, en- 
abling the latter strain to produce actinorhodin (8). Subsequently, 
clustered erythromycin biosynthetic genes from Streptomyces eryth- 
reus were transferred to S. lividans, which then synthesized an 
antibiotic indistinguishable from erythromycin A (9). Escherichia coli 
carrying this cloned gene cluster did not synthesize the antibiotic, 
possibly because of low transcriptional activity of Streptomyces 
promoters in E. coli. The fungi Neurospora crassa and Aspergillus 
niger, which normally do not produce β-lactam antibiotics, synthesized
penicillin V after transformation with a cosmid containing
Penicillium chrysogenum DNA that encoded enzymes in the penicillin 
biosynthetic pathway (10). 

Polyhydroxybutyrate (PHB), a storage product sequestered in 
large amounts by some bacteria under growth-limiting, carbon 
source-excess conditions, is a biodegradable polyester that already 
has small-scale applications. Alcaligenes eutrophus can produce not 
only PHB but, when supplied with different precursors, can synthe- 
size various polyhydroxyalkanoate copolymers as well (11). Meta- 
bolic engineering of the synthesis of these and related polymers 
should provide greater control over the nature and quantity of the 
polymer produced and should also offer alternative production 
organisms. The PHB synthesis operon from A. eutrophus, which 
encodes PHB polymerase, thiolase, and reductase activities, has been 
used to transform E. coli (12). As in A. eutrophus, this recombinant 
E. coli accumulates PHB when the nitrogen source is depleted; PHB 
concentrations in these cultures reach 50% of the dry cell weight. 

Assembly of pathways for simultaneous degradation of chloro-
and methylaromatics by combining and refining cloned pathway
segments and regulatory systems from several different organisms
exemplifies the iterative design of an effective hybrid organism (13).
The ultimate strain thus far constructed contains five pathway
segments obtained from three organisms. The biochemical and
metabolic complexities of the degradation of mixed substrates and
the resulting rationale behind each portion of this construction offer
useful general perspectives on metabolic engineering strategies (14).

Table 1. Heterologous activities recruited to alter small metabolite and
protein end products. The original metabolite serves as the substrate for the
synthesis of the new product through a pathway involving the new intermediate.
It is difficult to prove that the inserted activity alone is responsible for
an altered phenotype. In each case discussed here, the observed change in cell
function is consistent with the expected consequence of the newly installed
gene or genes. A. chrysogenum, Acremonium chrysogenum; GDP, guanosine
diphosphate; F. solani, Fusarium solani; P. diminuta, Pseudomonas diminuta.

Creating new products and new reactants. Expression of biosynthetic
genes for a secondary metabolite in a heterologous host that
synthesizes its own different secondary metabolite can result in the
construction of an array of enzymatic activities that yield novel
products. Among the novel antibiotics that have been produced in
recombinant strains of Streptomyces by such manipulations are
mederrhodins A and B and dihydrogranatirhodin (15), 2-norerythromycins
A, B, C, and D (16), and isovaleryl spiramycin (17).

Compounds new to the cell that result from a heterologous
activity often undergo further reactions. In some cases, such as in the
biosynthesis of indigo by E. coli that express Pseudomonas putida
naphthalene dioxygenase (18), these subsequent reactions are essential
components of the desired pathway. Another illustration of
metabolic engineering to introduce a novel intermediate into a host
involves recombinant E. coli that express the cloned tyrosinase gene
from Streptomyces antibioticus (19). Synthesis by the recombinant E.
coli strain of the pigment melanin, an ultraviolet light-absorbing
compound with material and cosmetic applications, depends on a
single critical catalytic step: the oxidation of tyrosine and L-dopa to
dopaquinone by tyrosinase; the remaining reactions that yield melanin
are apparently nonenzymatic. Melanin production is increased when
another protein from S. antibioticus is coexpressed with tyrosinase.
Although definitive evidence is not yet available, this second protein
may provide a copper-donor function that activates apotyrosinase.
Thus, increasing the expression of a cofactor-requiring protein as part
of a metabolic engineering scheme may require the engineering of an
increased supply of the cofactor as well.

Biodegradation of undesirable compounds can often be accomplished
by host enzymes after a heterologous activity provides the
initial attack on the target compound or compounds. For example,
the expression of Pseudomonas mendocina toluene monooxygenase in
E. coli enabled the efficient degradation of trichloroethylene, a
suspected carcinogen and widespread pollutant (20). In E. coli,
degradation can be induced by isopropyl-1-thio-β-D-galactoside or
by a temperature shift, rather than by toluene, as occurs in P.
mendocina. In addition, the engineered E. coli has degradation
kinetics (no competitive inhibition, as with toluene) and cosubstrate
requirements (glucose instead of toluene) that are superior to those
of the native host of this toluene monooxygenase activity. This 
example illustrates that transfer of a crucial enzyme activity to a 
different regulatory environment can render that activity useful for 
biotechnology. 

New metabolites arising from the action of cloned heterologous 
enzymes may also undergo undesirable side reactions. The precursor 
of 7ACA in engineered Acremonium chrysogenum, produced as a 
consequence of the cloned D-amino acid oxidase, can also react with 
hydrogen peroxide to give a useless by-product, dramatically reduc- 
ing 7ACA yield (3). Cloned degradation enzymes have led to 
metabolic dead ends in the sense that the host cannot convert their 
products further; in some cases these recalcitrant intermediates 
inactivate key catabolic enzymes (14). Other unexpected complications
can arise when desired end products are similar to some native
metabolite and are converted to another product by host enzymes. 
After observations of unexpectedly low yields of 2-KLG in a 
recombinant strain (2), it was found that 2-KLG was converted to 
L-idonic acid by endogenous 2-ketoaldonate reductase (2KR). 
Cloning, deletion mutagenesis, and homologous recombination of 
the mutated gene for 2KR into the chromosome were part of several
steps undertaken to develop an engineered organism able to accumulate
large amounts of 2-KLG (>120 g/liter) (21, 22). The present
engineered metabolic pathway involving these constructs (Fig. 1) 
shows complex interactions of enzymes and substrates that were 
identified, characterized, and engineered in an iterative process. 
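
The pathway logic described above can be sketched as a small reaction graph. This is a simplified, linear reading of the text and of the abbreviations in the Fig. 1 caption, not a reproduction of the figure: side reactions such as 5-keto-D-gluconate formation are omitted, and the names `REACTIONS`, `route`, and `without_enzyme` are hypothetical helpers introduced only for illustration.

```python
from collections import deque

# substrate -> list of (enzyme, product); a simplified reading of the text.
REACTIONS = {
    "D-glucose": [("GDH", "D-gluconate")],
    "D-gluconate": [("GADH", "2-keto-D-gluconate")],
    "2-keto-D-gluconate": [("2KDGDH", "2,5-DKG")],
    "2,5-DKG": [("DKGR (cloned)", "2-KLG")],
    "2-KLG": [("2KR", "L-idonate")],  # endogenous drain on the product
}


def route(start, goal, reactions):
    """Breadth-first search; returns the enzymes on a route, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        metabolite, enzymes = queue.popleft()
        if metabolite == goal:
            return enzymes
        for enzyme, product in reactions.get(metabolite, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, enzymes + [enzyme]))
    return None


def without_enzyme(reactions, enzyme):
    """Copy of the network with one enzyme activity deleted."""
    return {
        substrate: [(e, p) for e, p in steps if e != enzyme]
        for substrate, steps in reactions.items()
    }


print("glucose -> 2-KLG:", route("D-glucose", "2-KLG", REACTIONS))
print("2-KLG -> L-idonate (wild type):", route("2-KLG", "L-idonate", REACTIONS))
print("2-KLG -> L-idonate (2KR deleted):",
      route("2-KLG", "L-idonate", without_enzyme(REACTIONS, "2KR")))
```

In this toy network, deleting the 2KR step leaves the route from glucose to 2-KLG intact while removing the only route from 2-KLG to L-idonate, which mirrors the rationale given for the 2KR deletion.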

Fig. 1. Summary of the enzymes, intermediates, and by-products encountered
in the synthesis of 2-keto-L-gulonic acid (2-KLG) from glucose in
genetically engineered E. herbicola. The control of cloned (white printing on
black) 2,5-DKG reductase activity, which requires the reduced form
(NADPH) of nicotinamide adenine dinucleotide phosphate (NADP) supplied
by the cell metabolism, directs metabolite flow to 2-KLG. Wide band
with dots, cell membrane; straight bars, transport systems. Abbreviations are
as follows: GDH, D-glucose dehydrogenase; GADH, D-gluconate dehydrogenase;
2KDGDH, 2-keto-D-gluconate dehydrogenase; DKGR, 2,5-diketo-D-gluconate
reductase; IADH, L-idonate dehydrogenase; 5KR(G), 5-keto-D-gluconate
reductase (D-gluconate-producing); G, D-glucose; GA,
D-gluconate; 2KDG, 2-keto-D-gluconate; 2,5KDG, 2,5-diketo-D-gluconate;
IA, L-idonate; and 5KDG, 5-keto-D-gluconate. Reprinted by permission
from (21).

Perfecting strains by altering nutrient uptake and metabolite flow.
Increased growth rates, decreased nutrient demands for cell growth,
and higher attainable cell densities have advantages in many different
applications. The use of metabolic engineering to realize these
objectives has been based on increasing the efficiency of nutrient
assimilation, enhancing the efficiency of adenosine triphosphate
(ATP) production, and reducing the production of inhibitory
metabolic end products. In one of the earliest applications of
recombinant DNA to the improvement of the metabolism of a
commercial strain, the goal was improvement of the efficiency of
carbon conversion into cell mass by Methylophilus methylotrophus, a 
strain developed as an animal feed material. The native route of 
nitrogen assimilation used by this bacterium is the glutamate 
synthase pathway, which consumes one ATP per nitrogen incorpo- 
rated into glutamate. Nitrogen assimilation by means of glutamate 
dehydrogenase, a process absent from this organism, does not 
require ATP. In an effort to improve cell yield, glutamate dehydrogenase
from E. coli was expressed in a glutamate synthase mutant of
M. methylotrophus (23). The efficiency of carbon conversion was 
increased 4 to 7%. 
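
The energetic argument above can be put in rough numbers. The sketch below takes the one-ATP-per-nitrogen cost of the glutamate synthase route from the text; everything else is an assumption made for illustration: that all cell nitrogen is assimilated through glutamate, that nitrogen makes up 12% of dry cell weight (`n_fraction`), and the hypothetical function name `atp_for_nitrogen`.

```python
# ATP consumed per nitrogen atom incorporated into glutamate (from the text).
ATP_PER_N = {
    "glutamate_synthase": 1.0,
    "glutamate_dehydrogenase": 0.0,
}

N_MOLAR_MASS = 14.007  # g/mol


def atp_for_nitrogen(dry_weight_g, route, n_fraction=0.12):
    """mmol ATP spent assimilating the nitrogen in `dry_weight_g` of cells.

    n_fraction is an ASSUMED nitrogen mass fraction of dry biomass.
    """
    mol_n = dry_weight_g * n_fraction / N_MOLAR_MASS
    return 1000.0 * mol_n * ATP_PER_N[route]


saving = (atp_for_nitrogen(1.0, "glutamate_synthase")
          - atp_for_nitrogen(1.0, "glutamate_dehydrogenase"))
print(f"ATP saved per g dry cells: {saving:.2f} mmol")
```

Under these assumptions the saving is a few millimoles of ATP per gram of cells, which is at least consistent in direction with the modest gain in carbon conversion reported; the sketch does not attempt to predict the 4 to 7% figure.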

End products of carbon catabolism (acetate, ethanol, and lactate) 
that inhibit cell growth are produced by bacteria, yeasts, and 
mammalian cells under conditions of oxygen limitation or carbon 
source excess. The final optical density of E. coli grown under 
shake-flask aerobic conditions was increased threefold after introduction
of a plasmid that expressed pyruvate decarboxylase and
alcohol dehydrogenase from Zymomonas mobilis (24). The former 
activity, absent in unmodified E. coli, redirects catabolite fluxes from 
pyruvate and results in a shift from acetate production, which 
strongly inhibits cell growth, to production of ethanol, which is less 
inhibitory.

Microbial catabolic products such as ethanol, acetone, and butanol
are important industrial chemicals. Large increases in ethanol
yields from pentose and hexose sugar substrates from E. coli (25, 26)
and Erwinia chrysanthemi (27) have been achieved by transformation
with plasmids that encode pyruvate decarboxylase from Z. mobilis,
in some cases coexpressed with Z. mobilis alcohol dehydrogenase.
The E. coli so engineered have the potential practical advantages of
rapid and efficient conversion of several sugars found in biomass
(26).

α-Acetohydroxy acids, synthesized during fermentation by brewers'
yeast, leak into the medium where spontaneous hydroxylation
produces diacetyl, which has an undesirable flavor. On the basis of
suggestions that the time required for beer lagering is determined by
the time required for the enzymatic reduction of diacetyl by the
yeast, genes for the enzyme α-acetolactate decarboxylase (α-ALDC)
were cloned from Klebsiella terrigena or Enterobacter aerogenes, fused
to yeast promoters, and inserted into Saccharomyces cerevisiae on
multicopy plasmids. This enzyme converts α-acetolactate to acetoin,
rather than diacetyl; acetoin influences flavor only at relatively high
concentrations. Pilot brewing studies with these engineered strains
that express α-ALDC yielded beer of quality equal to that produced
by controls, but in a process time of 2 weeks, as compared to 5
weeks for the conventional process. The lagering step could be
omitted when the recombinant brewers' yeasts were used because of 
low diacetyl production by these organisms (28). 

Enabling a cell to utilize alternative materials as nourishment is 
another capability of metabolic engineering. In order to produce 
microbial surfactants from industrial waste raw materials, E. coli 
β-galactosidase and lactose permease were stably integrated into the
chromosome of two Pseudomonas aeruginosa strains. These recombinant
strains synthesize biosurfactants when grown in lactose and
whey-based minimal media (29). 

Yeast ornithine decarboxylase was cloned and expressed in cul- 
tured roots of Nicotiana rustica in order to direct a greater metabolite 
flux from ornithine to putrescine, a precursor of nicotine (30). Some 
clones showed approximately two times as much nicotine accumu- 
lation as the controls. Rearrangement of the native fluxes in the 
hyoscyamine-rich Atropa belladonna was motivated by greater commercial
demand for scopolamine, the 6,7-epoxide of hyoscyamine.
Expression of Hyoscyamus niger hyoscyamine 6β-hydroxylase in an
A. belladonna hairy-root clone produced three to ten times as much 
scopolamine as did wild-type clones (31). 
has proven effective in increasing the quantity of active protein
recovered. Overexpression of the E. coli chaperone proteins GroES 
and GroEL provided a five- to tenfold increase in assembled 
cyanobacterial Rubisco (D-ribulose-1,5-bisphosphate carboxylase/oxygenase)
enzyme coexpressed in E. coli. In vitro studies of
interactions among Rubisco and the GroE proteins implicate Mg2+-ATP
as a requirement for assembly (32). The failure to achieve
assembly of Rubisco from higher plants in altered E. coli signals 
future challenges in the transfer of heterologous protein processing 
pathways (33). Other challenges include genetic manipulations of 
processing pathways of bacteria that alter the solubility of recombi- 
nant proteins (34). In addition, opportunities exist for extending 
such strategies to eukaryotic hosts (35). 

Transfer of promising natural motifs: Vitreoscilla hemoglobin. Be- 
cause of the constant drive toward maximum cell densities to 
maximize volumetric productivity, growth and product synthesis in 
many industrial processes are limited by oxygen supply. The Gram- 
negative aerobic bacterium Vitreoscilla, which lives in poorly aerated 
environments, synthesizes increased quantities of a hemoglobin 
molecule in oxygen-limited cultures (36). Although the function of 
this protein in its natural host has not been established, this pattern 
of regulation of expression, combined with the oxygen-binding and 
release characteristics of the protein, suggest a possible beneficial 
physiological activity in poorly oxygenated environments. 

Motivated by this hypothesis and the premise that this beneficial 
function might be genetically transferred to industrial microorga- 
nisms, the gene for Vitreoscilla hemoglobin (VHb) was cloned and 
expressed in E. coli (37). Escherichia coli that carried a single copy of 
this gene integrated in the chromosome synthesized total cell 
protein more rapidly than an isogenic wild-type strain in oxygen- 
limited cultivations (Fig. 2), a response attributed to an increased 
efficiency of net ATP synthesis in the hemoglobin-expressing strain 
(38). Facilitation of oxygen transfer to the respiratory center (39) 
and modification of some aspect of cellular redox chemistry (38) 
have been suggested as contributing mechanisms for these phenom- 
ena. Coexpression of VHb increases the expression of cloned 
β-galactosidase, chloramphenicol acetyltransferase (CAT) (38), and
α-amylase (40) by 1.5- to 3.3-fold relative to controls in oxygen-limited
E. coli cultures, probably as a result of enhanced net ATP
synthesis. 

Fig. 2. (A) Time trajectories of total E. coli cell protein (dashed lines)
and cloned CAT activity (solid lines) in the wild type (open symbols) and
an engineered host that expresses VHb from a single gene copy integrated
in the chromosome (closed symbols) in oxygen-limited fed-batch
fermentations [reprinted from (38)]. (B) Cell densities (dashed lines) and
concentration of actinorhodin in the medium (solid lines) in batch
fermentations of S. coelicolor. Closed symbols are from a transformant
expressing VHb; open symbols are from a control transformant
[reprinted from (41)].

Aeration of bioreactors used in the synthesis of antibiotics is
frequently complicated by the thick broths that result from growth
of filamentous fungi and Streptomyces. Success of the strategy for
enhancing aerobic metabolism in other bacteria prompted cloning
and intracellular expression of VHb in two different Streptomyces
species (41). Streptomyces lividans with a multicopy hemoglobin
expression plasmid achieved final cell densities up to 54% greater
than the untransformed host in shake-flask cultivations. The presence
of cloned intracellular VHb in S. coelicolor markedly increased
secondary metabolite accumulation, without affecting cell growth 
relative to a control strain containing a mutated VHb gene (Fig. 2). 

These examples suggest a general genetic strategy for addressing 
stresses and corresponding productivity limitations encountered in 
bioprocessing: after identifying a response in nature to a similar 
stress (most likely involving a different organism), genes that specify 
that response can be transferred to the organism of choice. 



Redirecting Metabolite Flow 

Typically the route of reactions to a desired product passes several 
forks where intermediates can enter alternative pathways. At such 
bifurcations of metabolite flow, a common resource (for example,
substrate, enzyme, transport system, or ribosome) contributes to
two or more parallel processes. Maximizing product formation
requires that the desired route at each fork be made a priority and
that traffic in alternative pathways be minimized to the extent
possible without decreasing cell viability.
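
A toy calculation can make the idea of a fork concrete. The sketch below is not a model from the literature cited here; it assumes, purely for illustration, that two enzymes draw on one intermediate with Michaelis-Menten kinetics and that the intermediate is supplied at a fixed rate. The function names and all parameter values are hypothetical.

```python
def mm_rate(vmax, km, s):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def branch_split(influx, vmax1, km1, vmax2, km2):
    """Steady-state fluxes (v1, v2) leaving a shared intermediate.

    The intermediate settles at the concentration where the two
    branches together consume exactly `influx`; bisection finds it.
    """
    if not 0 < influx < vmax1 + vmax2:
        raise ValueError("influx must be positive and below total capacity")

    def total(s):
        return mm_rate(vmax1, km1, s) + mm_rate(vmax2, km2, s)

    lo, hi = 0.0, 1.0
    while total(hi) < influx:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total(mid) < influx:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    return mm_rate(vmax1, km1, s), mm_rate(vmax2, km2, s)


# Branch 1 is the desired route; amplifying its enzyme raises vmax1.
base = branch_split(1.0, vmax1=1.0, km1=1.0, vmax2=1.0, km2=1.0)
amplified = branch_split(1.0, vmax1=5.0, km1=1.0, vmax2=1.0, km2=1.0)
print("desired-branch share, base:     ", base[0] / sum(base))
print("desired-branch share, amplified:", amplified[0] / sum(amplified))
```

With equal Km values the split simply follows the ratio of the two Vmax values, so a fivefold amplification of the desired branch moves its share of the flux from one half to five sixths. Real forks involve regulation and feedback that this sketch ignores.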

Directing traffic toward the desired branch. Amplification of the
activity initiating a desired process at a fork in a metabolic flow is a
common strategy of metabolic engineering. Whereas isolation of
mutant enzymes that are desensitized to feedback repression was
achieved with classical methods, such mutants may now be obtained
more rapidly with the use of cloned genes. This approach also avoids
the complication of uncharacterized additional mutations that are
often obtained with classical, whole-cell mutagenesis.

The past decade has seen a new generation of strain improvements
in amino acid-producing coryneform bacteria with metabolic engineering
(also called molecular breeding) (42, 43). Central to the
success achieved was the development of new vectors and transformation
procedures.

Genetic engineering of improved threonine production by Brevibacterium
lactofermentum illustrates some of the strategies useful in
redirecting metabolite flow to the desired product. Figure 3 presents
an abbreviated diagram of the reactions involved in the synthesis of
the aspartate family of amino acids and a few key reactions that feed 
into the synthesis pathway for this family. Homoserine dehydroge- 
nase (HD) was amplified by cloning and transformation into a 
threonine- and lysine-producing mutant (designated M-15). This 
mutant organism was selected for its lack of feedback inhibition of 
aspartokinase by threonine and lysine and of HD by threonine (44). 
The respective final concentrations of threonine, homoserine, and 
lysine from benchtop fermentations were 25.0, 2.8, and 1.1 g/liter 
for the recombinant strain compared to 17.5, 0.5, and 12.1 for 
M-15. Subsequent further engineering to coexpress cloned homo- 
serine kinase (HK) with HD further increased the final threonine 
concentration to 33 g/liter and reduced homoserine and lysine levels, 
relative to the strain with cloned HD alone (45). In another study, 
the coryneform gene for HD was mutagenized to eliminate feedback 
inhibition by threonine. Introduction of this mutated HD gene into 
a lysine producer shifted the final lysine concentration from 65 
g/liter to 4 g/liter and the final threonine concentration from 0 g/liter 
to 52 g/liter (43). Threonine production by M-15 was increased 
12% by the expression of cloned phosphoenolpyruvate (PEP) 
carboxylase (PEPCase) (46). This manipulation was motivated by
the researchers' desire to increase oxaloacetate (OAA) production
and thereby to increase carbon flow into amino acid production.
Further improvements in rates of amino acid synthesis and yields
will depend on a better understanding of mechanisms of regulation
of gene expression and metabolite flow in these bacteria (47, 48).

Fig. 3. Pathways of the biosynthesis of the aspartate family of amino
acids. Metabolite abbreviations are as follows: acetyl CoA, acetyl coenzyme
A; TCA, tricarboxylic acid cycle; Asp, aspartate;
ASA, aspartate semialdehyde; Hse, homoserine;
Lys, lysine; Met, methionine; Hse-P, O-phosphohomoserine;
and Thr, threonine.
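
The threonine, homoserine, and lysine titers quoted above for M-15 and for the strain with amplified HD can be compared directly. The short sketch below uses only those reported concentrations (in g/liter); the shares it computes are mass fractions of the three measured products, not molar yields, and the names `TITERS`, `mass_shares`, `SHARES`, and `THR_FOLD` are introduced here only for illustration.

```python
# Final concentrations in g/liter, as quoted in the text.
TITERS = {
    "M-15": {"Thr": 17.5, "Hse": 0.5, "Lys": 12.1},
    "M-15 + cloned HD": {"Thr": 25.0, "Hse": 2.8, "Lys": 1.1},
}


def mass_shares(titers):
    """Fraction of the summed titer contributed by each amino acid."""
    total = sum(titers.values())
    return {aa: conc / total for aa, conc in titers.items()}


SHARES = {strain: mass_shares(t) for strain, t in TITERS.items()}
THR_FOLD = TITERS["M-15 + cloned HD"]["Thr"] / TITERS["M-15"]["Thr"]

for strain, shares in SHARES.items():
    total = sum(TITERS[strain].values())
    line = ", ".join(f"{aa} {share:.1%}" for aa, share in shares.items())
    print(f"{strain}: total {total:.1f} g/liter ({line})")
print(f"Threonine fold change: {THR_FOLD:.2f}")
```

Read this way, the summed titer of the three products changes little between the two strains, while the threonine share rises from about 58% to about 87%, largely at the expense of lysine. That pattern is what one would expect if HD amplification mainly redistributes flux at the aspartate semialdehyde branch point rather than increasing total flux into the family.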

Because of metabolic engineering, E. coli has become an indus- 
trially important producer of amino acids. Transformation by 
multicopy plasmids that contain tryptophan (49) and threonine (50) 
biosynthetic genes has increased production of these amino acids.
A project to engineer phenylalanine production in E. coli showed 
that overexpression of some genes in the phenylalanine biosynthetic 
pathway could cause a decrease in phenylalanine production and 
that inducible excision vector technology can be used to manipulate 
the biosynthesis of tyrosine, an inhibitor of the desired pathway
(51). An intermediate in a metabolic pathway can be overproduced
by combining a mutation that blocks that intermediate's use by the
cell with genetic augmentation of precursor flow into that
pathway; although this concept has been used extensively in classical
genetic production of organisms that are amino acid overproducers,
it can be implemented in other contexts by metabolic engineering
(52).

The gene eryF in Saccharopolyspora erythraea encodes the first
enzyme in the pathway from 6-deoxyerythronolide B to the antibiotic
erythromycin. After the targeted disruption of this gene using
an integrative plasmid, 6-deoxyerythronolide B was converted to an 
erythromycin derivative that is more stable at the low pH of the 
stomach (53). 

Because enzyme activities involved in secondary metabolite pro- 
duction are regulated at both the gene and protein levels, identifying 
genetic changes that accelerate synthesis of these metabolites is 
challenging. One successful strategy is based on measuring the 
biosynthetic pathway intermediate concentrations in the growth 
medium. Relatively high extracellular concentrations of the inter- 
mediate penicillin N suggested that the activity that converts this 
intermediate to cephalosporin C (encoded in cefEF) may limit the 
rate of the overall pathway (54). Thus, expression of cefEF was 
elevated through increased gene dosage in a production strain of 
Cephalosporium acremonium. This recombinant fungus exhibited a 
15-fold reduction in penicillin N production and an increase of 
~15% in cephalosporin C production. 

Routing through protein processing pathways has also been 
altered by manipulation of host genes. The overproduction lethality 
commonly observed with exported β-galactosidase fusion proteins
in E. coli is suppressed by the overproduction of E. coli prlF (55),
and the expression of E. coli DnaK enables export of lacZ-hybrid
proteins that are otherwise confined to the cytoplasm (56). An
NH2-terminal methionine often differentiates cloned polypeptides
synthesized in E. coli from their native human counterparts. Coexpression
of cloned E. coli methionine aminopeptidase with human
interleukin-2 in E. coli has substantially reduced the fraction of
product with methionine at its NH2-terminus (57). Observation of
large quantities of a variant of human tissue plasminogen activator
(tPA) associated with GRP78 in the rough endoplasmic reticulum
of CHO cells suggested that GRP78 binding was a rate-limiting 
step in tPA secretion. (GRP78 is the 78-kD glucose-regulated 
protein, one of the stress-response proteins.) Coexpression of 
antisense GRP78 message resulted in smaller quantities of GRP78 
and faster tPA secretion (58). 

A quantitative study was conducted of S. cerevisiae isolates that 
contain different numbers of the phosphoglycerate kinase (PGK) 
gene (PGK1). In some cases a 10 to 15% increase in PGK activity 
gave rise to a higher (30%) overall cell mass yield when the yeast
were grown on glucose. However, in another construct that contained
more copies of PGK1, yield was depressed by 40% (59).
These results show the importance of fine-tuning the amount of
gene or enzyme amplification to achieve the desired benefit. 

Reducing competition for a limiting resource. Computer simulations
with a detailed single-cell model suggested that competition between
vector- and host-encoded messages for a common pool of
ribosomes could limit cloned gene expression (60), a prediction
consistent with experimental observations of reduced ribosome
content in recombinant E. coli that contain more plasmids per cell
(61, 62). By expressing a cloned mutant 16S ribosomal RNA, a
population of ribosomes is created that is specialized for expression
of only those cloned gene transcripts that bear a corresponding
mutation in the Shine-Dalgarno sequence (63). Large amounts of
messenger RNA transcribed from the cloned gene will not interact
with the primary population of native ribosomes; therefore the
expression of cloned genes will not interfere with the simultaneous
expression of host cell genes important for protein synthesis. With
this approach, expression of cloned β-galactosidase was increased by
35%. A 30% reduction in specific growth rate occurred after
expression of the specialized 16S RNA; however, no growth rate
reduction was observed on induction of cloned β-galactosidase
synthesis (64).

Revising metabolic regulation. Positive control genes have been 
found so far in the biosynthetic gene clusters for actinorhodin, 
bialaphos, streptomycin, and undecylprodigiosin, which are all 
secondary metabolites produced by Streptomyces species. Cloning 
additional copies of an activator gene in the wild-type host can 
substantially increase antibiotic production, as indicated for unde- 
cylprodigiosin (8). 

Modification of the regulation of expression of maltose permease 
and maltase is the basis for a metabolically engineered baker's yeast 
intended to reduce the time for leavening of sweet doughs (65). 
Glucose normally represses expression of these proteins and thereby 
blocks simultaneous maltose utilization. The engineered strain uses 
constitutive yeast promoters for these enzymes to enable simulta- 
neous uptake and catabolism of both sugars. This is one of few 
examples in which a transport system has been manipulated. 

A genetic engineering strategy to stimulate CO2 production by
bakers' yeast seeks to consume ATP (66). This could relieve ATP
inhibition of phosphofructokinase (PFK) and pyruvate kinase, two
regulatory enzymes in sugar catabolism. A futile cycle with PFK
was created by expressing cloned yeast fructose 1,6-bisphosphatase
(FBPase) from a yeast glycerophosphate dehydrogenase promoter
that is induced by glucose; FBPase is not normally expressed in the
presence of high glucose concentrations. This yeast strain produced
20 to 25% more CO2 than the wild type.

In order to construct a pathway for processing the pollutant 
4-ethylbenzoate, it was necessary to alter the regulation of the 
alkylbenzoate degradation pathway encoded on the Pseudomonas 
TOL plasmid (67). Originally 4-ethylbenzoate did not induce 
transcription of the crucial meta operon. Mutations were introduced 
into the positive regulator of Pm (the promoter of the meta operon) 
that enabled Pm activation by 4-ethylbenzoate.

Completing the Metabolic Engineering Cycle: 
Potentials and Perils of Rational Design 

The iterative cycle of genetic modification, analysis of the meta- 
bolic consequences of this change, and choice of the next genetic 
modification has been successfully implemented in a few instances 
with promising results. Contemporary concepts and technologies 
for each function in this cycle are summarized next. 

Cloning in industrial strains. The lack of suitable vectors and 
methods for the introduction of exogenous DNA limits the appli- 
cation of metabolic engineering in many important industrial orga- 
nisms (68). Electroporation and conjugation have proven useful in 
introducing DNA into diverse organisms. The stable propagation of 
cloned genes remains problematic even in such a well-studied system 
as Bacillus subtilis and is apparently a result of the error-prone
rolling-circle replication mechanism used by many plasmids in 
Gram-positive bacteria. Extensive rearrangements and deletions of 
both chromosomal and plasmid DNA occur frequently in some
species, complicating their systematic manipulation. Restriction 
(cleavage) of heterologous DNA is a limitation in efficient engineer- 
ing of many cells of practical interest. 

The hurdles to be surmounted in developing the necessary genetic 
tools are illustrated in research that is establishing a foundation for 
engineering the complex catabolic metabolism of Clostridium
acetobutylicum. This bacterium is the basis for the biological production
of the industrial chemicals acetone and butanol. Efficient transfor- 
mation of this organism required optimization of an electroporation 
protocol, and it was discovered that, because of a clostridial restric- 
tion enzyme system, E. coli is not a suitable organism for the cloning 
of clostridial DNAs, whereas B. subtilis is (69). Technology for 
chromosomal integration should soon follow, as several C. acetobu- 
tylicum genes have now been cloned. 

Dissecting physiological responses. For the most effective design of a 
subsequent genetic manipulation, it is useful to know the concen- 
trations of intracellular proteins and metabolites. The concentrations 
of many cellular proteins can be determined in principle from 
two-dimensional gel electrophoresis, but data bases are necessary to 
identify individual proteins (70). In vitro assays of changes in 
activities of key enzymes have been widely applied. 

A broad spectrum of analytical methods can be applied for 
determining metabolite concentrations. The measurement of con- 
centrations and in some cases of fluxes in particular pathways of 
interest can often be aided by the application of isotopically labeled 
precursors. For example, with the use of labeled acetate and 
glutamate, along with quasi-steady state conservation equations for 
intracellular metabolites, the velocities for carbon flow through E. 
coli growing on acetate can be determined (71). 
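
The quasi-steady-state conservation equations mentioned above can be illustrated with a minimal sketch. The two-metabolite branched network, the reaction names (v_in, v1, v2, v_out), and the numerical rates are hypothetical and are not taken from reference 71. Isotopic labeling data, which supply additional constraints in real studies, are omitted here.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination (exact fractions)."""
    n = len(a)
    # Build the augmented matrix [a | b].
    m = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [x / pv for x in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]


def intracellular_fluxes(v_in, v_out):
    """Hypothetical network: uptake v_in -> A; A -> B (v1); A -> byproduct (v2);
    B -> product (v_out). With no accumulation of A or B:
        balance on A:  v1 + v2 = v_in
        balance on B:  v1      = v_out
    The measured rates v_in and v_out fix the unmeasured fluxes v1 and v2.
    """
    a = [[1, 1],
         [1, 0]]
    b = [v_in, v_out]
    v1, v2 = solve_linear(a, b)
    return {"v1": v1, "v2": v2}


fluxes = intracellular_fluxes(10, 6)
```

With these assumed measurements the balances assign 6 units of flux to the product branch and 4 to the byproduct branch. Larger networks follow the same pattern, with one balance row per intracellular metabolite.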

Nuclear magnetic resonance (NMR) spectroscopy has been ap- 
plied to estimate metabolite concentrations in whole cells, cell 
extracts, and growth media (72). For example, ³¹P NMR measurements
of S. cerevisiae cells converting glucose to end products under
anaerobic conditions, in concert with a methodology for extracting 
individual component information from the sugar phosphate por- 
tion of the spectrum, provided estimates of metabolite concentra- 
tions that were essential for analysis of the pathway (73). The time 
and instrumentation required to evaluate metabolite concentrations 
presently limits rational metabolic engineering. 

Design principles and cell models: Coping with complexity and 
coupling. No universal principles have emerged from metabolic 
engineering research to guide the choice of the next useful genetic 
alteration. Attempts to address these problems with artificial intel- 
ligence have shown that there is no substitute for knowledge of the 
pathways involved, their regulation, and their kinetics. Some useful 
approaches include measurements of intermediate concentrations to 
indicate possible rate-determining reactions, genetic transfer of 
natural stress response motifs, and applications of organisms that 
can be used over wide ranges of temperature and pH (74). 

Alternatively, if a mathematical description of the system is 
available, sensitivity analysis can be applied to calculate the expected 
response of the pathway to changes in the individual steps or 
pathway segments. An advantage of such an approach is its simultaneous
determination of the sensitivities of the desired flux to many
different participating reactions, permitting the identification of
situations in which several genetic modifications in concert are
required to achieve a desired response.

A body of theoretical developments known as metabolic control
theory is well suited to the requirements of rational metabolic
engineering (75). A central result provided by this theory is a
sensitivity calculation that provides the flux control coefficients,
defined as the fractional changes of flux expected for a unit fractional
change in the amount of each enzyme participating in a given
pathway. In addition, it is possible to evaluate the sensitivity of flux
through the pathway to individual parameters in kinetic expressions
for each of the enzymes, thereby providing guidance for useful
protein engineering to accelerate the pathway. Analysis of several
simple examples that involve unbranched sequences of reactions
showed that sole control of flux by any single step (in other words,
the existence of a single, rate-limiting step) is in general not
expected. Instead, the flux through the pathway is usually influenced
by the activities of several individual steps. This result, augmented
by specific model calculations outlined below, provides motivation
for sequential improvement of metabolic pathways.
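
As a numerical illustration of the definition above, the following sketch estimates flux control coefficients for a hypothetical unbranched two-step pathway S -> X -> P. The linear rate laws, the parameter values, and the function names are assumptions chosen for simplicity and do not come from references 75 to 79. Each coefficient is approximated as the change in ln(flux) per change in ln(enzyme amount), using a central finite difference.

```python
import math


def steady_state_flux(e1, e2, k1=1.0, k2=1.0, K=1.0, S=1.0):
    """Steady-state flux for the assumed rate laws
        v1 = e1 * k1 * (S - X / K)   (reversible first step)
        v2 = e2 * k2 * X             (irreversible second step)
    Setting v1 = v2 gives X, and hence the pathway flux J = v2.
    """
    a1 = e1 * k1
    a2 = e2 * k2
    return a1 * a2 * S / (a1 / K + a2)


def flux_control_coefficients(e1, e2, rel_step=1e-4, **kinetics):
    """Return (C1, C2), where Ci = d ln J / d ln e_i, by central differences."""
    def log_flux(x1, x2):
        return math.log(steady_state_flux(x1, x2, **kinetics))

    up, down = 1.0 + rel_step, 1.0 - rel_step
    d_ln_e = math.log(up) - math.log(down)
    c1 = (log_flux(e1 * up, e2) - log_flux(e1 * down, e2)) / d_ln_e
    c2 = (log_flux(e1, e2 * up) - log_flux(e1, e2 * down)) / d_ln_e
    return c1, c2


# Equal enzyme amounts: control is shared between the two steps.
c1, c2 = flux_control_coefficients(1.0, 1.0)

# Ten-fold amplification of enzyme 1: control moves to step 2.
c1_amp, c2_amp = flux_control_coefficients(10.0, 1.0)
```

In this toy model neither step has sole control of the flux, and the two coefficients sum to one. Amplifying the first enzyme lowers its own coefficient and raises that of the second step, which is the kind of shift that motivates sequential rather than single-step pathway improvement.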

In one of the few cases in which detailed kinetic expressions for
each step in the reaction network, as well as the concentrations of all
substrates and effectors, are known or estimated, flux control 
coefficients were determined for nongrowing yeast converting glu- 
cose to ethanol and other end products (76). Several general points 
are suggested by this investigation. First, the sensitivity of pathway 
flux to individual step changes depends on the environment in 
which the cell is grown. It is therefore important to carry out 
modeling and measurement under the expected industrial condi- 
tions. Second, flux control can be extraordinarily sensitive to some 
parameters such as intracellular pH. Third, in any system with 
interacting pathways (and it is difficult to envision any case where 
this does not occur), the most general version of metabolic control 
theory must be used (77). Such coupling is extensive in the usual 
case of growing cells, where the pathway of interest interacts with all 
of the other metabolic processes in the cell. A strategy for accom- 
plishing flux control coefficient calculations in this situation has been 
presented (78). Finally, calculations with the kinetic model formu- 
lated in a yeast biocatalysis study indicate that amplification of the 
activity of one enzyme results in a shift of flux control to other steps 
in the pathway. New theory that presumes linear approximations for 
all rate expressions provides estimates of flux control coefficients,
without requiring knowledge of kinetics and using time-resolved
metabolite concentration measurements instead (79); the practical 
merits of this approach have not yet been evaluated. 

Both sensitivity to small changes and simulations of responses to 
large changes in intracellular activities can be calculated from a 
detailed and reliable mathematical model of the cell. Large quantities 
of biological information have been integrated into computer 
models for single cells (60, 80) and several molecular control systems 
(81). These have successfully simulated consequences of several 
genetic and environmental changes. Although useful initial direc- 
tions for genetic improvement have been suggested by such models, 
they have not yet been used as the central tool in an iterative 
metabolic engineering study. In spite of their obvious limitations, 
these mathematical structures are the only way that the net consequences
of simultaneous, coupled, and often counteracting processes
can be simulated and evaluated consistently and quantitatively.

Minimizing response cascades. Unanticipated cell responses to a 
genetic modification may complicate rational practice of the meta- 
bolic engineering cycle. Introduction of a cloning vector alone may 
result in a large cascade of metabolic changes, many of which are 
difficult to anticipate. For example, the introduction of multicopy 
plasmids into E. coli, even without overexpression of a cloned 
product, has been shown to cause substantial changes in growth 
rates, cell cycle regulation, amounts of many individual proteins, 
glucose uptake, and carbon catabolite production rates (82). Trans- 
formation of yeast with multicopy plasmids can introduce lesions 
that persist after the plasmids have been eliminated from the 
transformants (83). Different mammalian cell clones transfected by 
the same vector often exhibit different growth rates and cell sizes. 
Therefore, introduction of a desired genetic change should be 
carefully configured to minimize perturbation of the host, using the 
lowest gene dosage and lowest expression level that give the desired 
result. The apparatus used for selecting the modified strain should 
also be carefully considered. For example, ampicillin resistance used 
for the maintenance of many laboratory recombinant E. coli strains 
is provided by cloned β-lactamase. This precursor must be processed
at the cytoplasmic membrane in competition with host cell prepro- 
tein, which often results in a major physiological disruption. 

Even if the genetic manipulation is accomplished in a relatively 
well-controlled fashion, the regulatory apparatus of the cell at both 
the gene and protein levels may confound the intended change or 
even alter cellular activities. For example, amplification of citrate 
synthase in E. coli did not increase the flux through the citric acid 
cycle because of a compensating modulation of the activity of 
isocitrate dehydrogenase (84). Expression of even low concentra- 
tions of unnatural proteins can activate stress responses, influencing 
many cell functions (85). Anticipating and accounting for such 
regulatory responses to genetic intrusions are fundamental challenges
for the future.

Network Rigidity and Metabolic Engineering
in Metabolite Overproduction

Gregory Stephanopoulos and Joseph J. Vallino* 



In order to enhance the yield and productivity of metab- 
olite production, researchers have focused almost exclu- 
sively on enzyme amplification or other modifications of 
the product pathway. However, overproduction of many 
metabolites requires significant redirection of flux distri- 
butions in the primary metabolism, which may not readily 
occur following product deregulation because metabolic 
pathways have evolved to exhibit control architectures 
that resist flux alterations at branch points. This problem 
can be addressed through the use of some general con- 
cepts of metabolic rigidity, which include a means for 
identifying and removing rigid branch points within an 
experimental framework. 



ALL ORGANISMS USE PRIMARY METABOLIC PATHWAYS TO 
supply precursor metabolites and energy to anabolic path- 
ways that synthesize cellular constituents that are necessary 
for growth and maintenance. In many industrial strains of microor- 
ganisms (as well as tissue and plant cultures), these anabolic 
pathways have been exploited for the overproduction of compounds 
(such as amino and nucleic acids, antibiotics, vitamins, enzymes, and 
proteins) that cannot be synthetically produced or for which it is not 
economical to do so. In general, a particular metabolite is overpro- 
duced by deregulating the pathway directly associated with the 
synthesis of that metabolite, or, more recently, by transforming a
robust host organism (typically Escherichia coli) with the genes that 
encode for the synthesis of the desired product (1, 2). This
approach, however, does not necessarily result in high product 
yields (defined as the moles of product formed per mole of substrate 
consumed) since carbon flux distributions at key branch points 
(nodes) in the primary metabolism [such as glycolysis, tricarboxylic 

The authors are in the Di 



acid (TCA) cycle, and pentose phosphate pathway] must often be
radically redirected from the flux distributions that are normally
associated with balanced growth. Such metabolic flux alterations are
often directly opposed by mechanisms for controlling enzyme
activity that have evolved to maintain flux distributions that are
optimal for growth. We refer to this inherent resistance to flux
alterations as metabolic or network rigidity and to the genetic
modifications of specific nodes in the primary metabolism for the
purpose of enhancing yield and productivity as metabolic engineering
(3). Although genetic manipulations can now be readily performed,
there are relatively few accounts of successful metabolic flux
alterations because of the complex, nonlinear nature of the metabolic
control architectures.

The nature and types of metabolic rigidity are reviewed in this
article along with methods to identify and possibly circumvent such
undesirable nodal controls. The overproduction of lysine by Corynebacterium
glutamicum [and related strains (4)] is used as a vehicle to
illustrate key points because of: (i) the lack of compartmentalization
in bacteria; (ii) the need for significant flux alterations to optimize
lysine biosynthesis; and (iii) the apparent marginal success of 
mutation-selection (5, 6) or genetic engineering (7) techniques used 
to that end. The concepts, however, are of general value, and the 
methods are applicable to other metabolic products as well. 



Basis of Metabolic Rigidity 

Although intracellular metabolite concentrations can fluctuate 
during growth, on average, the distributions of the major cellular 
groups (proteins, RNA, DNA, lipids, and so forth) remain relatively 
proportional to one another throughout balanced growth (8). In
fact, metabolites and energy required to synthesize an E. coli cell 
have been calculated on the basis of its known composition (9). In 
order to preserve this regularity in cellular composition, the primary 
metabolism has evolved coordination of pathway control, such that 
building-block metabolites, energy [such as adenosine triphosphate 
(ATP)], and biosynthetic reducing power [such as nicotinamide 
adenine dinucleotide phosphate (NADPH)] are synthesized in
approximate stoichiometric ratios during balanced growth. Al-

IN MEMORIAM



Jay Bailey (1944-2001) 



Jay Bailey passed away in Zurich on 9 May 2001 at the 
age of 57. Jay received his education at Rice University, 
graduating with a B.A. in 1966 and Ph.D. in 1969, both in 
chemical engineering. After a short period with Shell 
Development he joined the Chemical Engineering Faculties 
of the University of Houston in 1971 and Caltech in 1980. 
In 1992 Jay was appointed Professor of Biotechnology at 
the Swiss Federal Institute of Technology (ETH) in Zurich. 

In his early years Jay studied extensively the dynamics of 
chemical reactions and reaction networks with particular 
emphasis on the origin and interpretation of autonomous 
and forced oscillations of chemically reacting systems. 
During the 1970s, his interest shifted gradually to biological
systems, and this shift culminated in the publication
in 1977 of the landmark textbook Biochemical Engineering
Fundamentals (coauthored with D. F. Ollis). At the time of 
his passing, Jay had coauthored approximately 400 publi- 
cations, mostly in the field of biotechnology, including 
many seminal papers and visionary commentaries in leading 
journals. 

Besides his outstanding contributions to biotechnology, 
Jay was instrumental in defining and building the founda- 
tions of metabolic engineering. He is arguably the first 
engineer who, in the early 1980s, embraced and promoted 
genetic engineering as an enabling new technology for 
improving cellular biocatalysts for industrial processes. This 
naturally led to the need to study the behavior of bioreac- 
tion networks in their entirety, analyze metabolic flux and 
flux control, and rigorously describe the physiology of wild- 
type and recombinant microorganisms. Although some of 
these questions had been investigated before by biochemists 
and chemical engineers, their sum total emerged as a distinct 
new field, metabolic engineering. Jay played a key role in 
setting the foundations and expanding this field. 

Until the last day before being admitted to the hospital, 
Jay was actively working on the program of the 4th Con- 
ference of Metabolic Engineering, which he would have 
chaired in October of 2002. His plans for the conference 
program as well as his more recent scientific writings reveal 
a broader vision for metabolic engineering: First, metabolic 
engineering is at the forefront of functional genomics due 
to its efforts to describe the cellular physiology of wild- 
type microorganisms and recombinants with well-defined 
genetic backgrounds. Second, metabolic engineering provides
an ideal integrating platform of genomic and physiological
information and data by taking a holistic view of
metabolic networks and cellular physiology for the iden- 
tification of targets for genetic manipulation. As such, it 
is contributing concepts and methods of importance to 
systems biology. Finally, Jay expanded metabolic engi- 
neering beyond the original context of industrial strain 
improvement to also include mammalian cells, tissues, and 
medical applications. 

Jay leaves a very rich legacy to the biotechnology and 
biochemical engineering communities. He mentored a 
school of distinguished students with brilliant careers in 
industry and academia and, through his teaching and writ- 
ings, ushered biochemical engineering into the modern era 
of cellular and molecular biotechnology. Most importantly, 
he upheld the highest standards in education and research. 
These contributions make Jay undoubtedly the most 
influential biochemical engineer of modern time. 

Jay was internationally recognized for his work in diverse 
areas of chemical engineering and biotechnology. He won 
numerous honors and awards in his career. The most recent 
was the First Merck Award in Metabolic Engineering. His 
award acceptance speech summarized his vision for meta- 
bolic engineering and formed the basis of the first perspec- 
tive article in this journal (Metab. Eng. 3, 111-114, 2001). 
We are pleased to be able to reprint below the rendition of 
Bob Dylan's song They Are A Changin' by singer-poet Jay 
Bailey. 

In Jay's honor, Metabolic Engineering is instituting a 
Young Investigator Best-Paper Award in Metabolic Engi- 
neering to be awarded every 2 years to the author(s) of an 
outstanding paper published in Metabolic Engineering. 
Details about the administration of the award will be 
provided in a future issue. 

The (Metabolic Engineering) Times, 

They Are A 'Changin' (Dylan/Bailey) 

Come gather 'round people wherever you roam, 
And admit that the waters around you have grown, 
And accept it that soon you'll be drenched to the bone. 
If your time to you is worth saving, 

Then you'd better start swimming or you'll sink like a stone, 
For the times, they are a changin'. (Bob Dylan)



Come gather Metabolic Engineers 'cross the land 

At ME III we'll take command 

Of cells that are too slow to produce or grow. 

If it's higher fluxes you're needin'

Then we'll shift the controls, and block bad outflows.

For the times, they are a changin'.

Do you need a new molecule or neutraceutical 
The Metabolic Engineer has the answers for you. 
We'll import new pathways, and shuffle them too. 
Is your lead compound library fadin? 
We'll give new adducts to your old natural products 
For the times, they are a changin'.

Rational or random, which way is best? 
Solving the problem passes the test. 
Complex responses confuse the quest. 
More genetic and array technologies 
Will give us insights to networks' delights. 
For the times, they are a changin'.



Genomes are in hand, the sequences there, 

An amazing resource that we all share. 

Genes and controllers, bioinformatics tells us where. 

But how is all of this workin? 

Let's decipher a yeast, understand that at least. 

For the times, they are a changin'. 

How is phenotype controlled by the genes? 
Nobody knows, least of all the machines. 
Medicine will thrive if we can discover the means, 
To merge our knowledge and information 
And find genes' intent and control by environment. 
For the times, they are a changin'. 

Metabolic Engineers have all the tools — 
Biology, computing, and engineering rules, 
Knowledge, experience, perspective on detail. 
Let's help Metabolic Genomics to set sail. 
Opportunity's here ...but now it's time for a beer! 
For the times, they are a changin'. (Jay Bailey)

October 2000 



Videocassettes of Jay's Merck Award talk at the 3rd 
Metabolic Engineering Conference are available through the 
Engineering Foundation. Contact Ms. Barbara Hickernel 
at 212-591-7836 or engfnd@aol.com. 

Gregory Stephanopoulos 
Co-Editor, Metabolic Engineering

Abstract: James (Jay) E. Bailey was a pioneer in biotechnology
and biochemical engineering. During his 30 years
in academia he made seminal contributions to many 
fields of chemical engineering science, including cataly- 
sis and reaction engineering, bioprocess engineering, 
mathematical modeling of cellular processes, recombi- 
nant DNA technology, enzyme engineering, and meta- 
bolic engineering. This article celebrates some of his 
contributions to the engineering of molecular and cellu- 
lar biocatalysts, and identifies the influence he had on 
current and future research in biotechnology. © 2002 
Wiley Periodicals, Inc. Biotechnol Bioeng 79: 490-495, 
2002. 

Keywords: James Bailey; contributions to biotechnology 
and biochemical engineering; biocatalysis; metabolic en- 
gineering 

INTRODUCTION 

James (Jay) E. Bailey, a pioneer in biotechnology and bio- 
chemical engineering, succumbed to metastatic cancer on 
May 9, 2001. After receiving his undergraduate and doc- 
toral degrees at Rice University and a brief stint at Shell 
Development (Houston), he started his academic career in 
1971 at the University of Houston. In 1980 he moved to the 
California Institute of Technology, and in 1992 he moved 
yet again to the Swiss Federal Institute of Technology 
(ETH, Zurich) where he was Professor of Biotechnology 
until his untimely death. During his 30 years in academia he 
made seminal contributions to many fields of chemical en- 
gineering science, including catalysis and reaction engineer- 
ing, bioprocess engineering, mathematical modeling of cel- 
lular processes, recombinant DNA technology, enzyme en- 
gineering, and metabolic engineering. Here we summarize 
some of his contributions to the engineering of molecular 
and cellular biocatalysts. 

Bailey's growing interest in biological systems in the 
1970's and 1980's was no doubt due to his interest in 
chemical catalysis. So it was no surprise that he moved into 
the fields of enzyme technology and fermentation/cell culture.
In essence, these areas are catalytic, whether it is in the
use of an enzyme to catalyze biotransformations or a cell to 
produce specific compounds. Although he did not use en- 
zymes synthetically, he did set the stage for research that 
would lead to several generations of enzymologists, chem- 
ists, and biochemical engineers who use enzymes in organic 
synthesis. By laying the quantitative groundwork for immo- 
bilized enzyme systems, and studying their structure and 
function in synthetically relevant forms, Bailey elevated 
biocatalysis from a synthetic oddity to a promising technology.

In the mid-1980's, shortly after Bailey started to acquaint
himself with the principles and practice of emerging recom- 
binant DNA technologies, he came to realize that the sci- 
ence of metabolism presented a particularly fertile ground 
for a chemical engineer to apply this new technology. To 
emphasize the marriage between metabolic biochemistry 
and chemical engineering science, he coined the term 
"metabolic engineering." By the mid-1980's he had set his 
sights squarely on the twin goals of developing the funda- 
mentals of metabolic engineering and identifying interesting 
applications within this new area of applied science that 
would highlight its longterm potential to the broader scien- 
tific community. This was to remain a dominant theme in 
his research program for the remainder of his career. His 
vision also inspired an entire generation of students from his 
own and other laboratories, many of whom continue to evolve
the frontiers of metabolic engineering to this day. 

IMMOBILIZED ENZYME TECHNOLOGY 

When compared to their chemical counterparts, biocatalysts 
are exquisitely selective and highly reactive over a broad 
range of operating conditions. Moreover, whole microbial 
cells (primarily bacterial and fungal) and their catalytic ma- 
chinery (e.g., enzymes and metabolic pathways) can accept 
a wide array of complex molecules as substrates, yielding 
products with unparalleled chiral (enantio-), positional (re- 
gio-), and chemoselectivities. Such high selectivity affords 
efficient reactions with few byproducts, ensuring that

EXHIBIT 4



enzymes can be used in both simple and complex transfor- 
mations without the need for tedious blocking and deblock- 
ing steps that are commonplace in enantio- and regioselec- 
tive organic synthesis. 



Structure and Function of Enzymes on 
Solid Supports 

Prior to Bailey's arrival in the enzyme technology arena, the 
vast majority of biocatalysts, even for commercial applica- 
tions, involved soluble enzymes. Nonetheless, heteroge- 
neous preparations offer a variety of advantages over ho- 
mogeneous ones for scaled-up biocatalytic operation, in- 
cluding retention of the enzyme in the bioreactor, use of 
packed-bed operation, and high catalyst loading. Immobi- 
lized enzymes, however, also display different structural 
and functional properties, as well as being subject to sub- 
strate diffusional limitations that impact observed enzyme 
activity and stability. 

As early as 1977, Bailey recognized that immobilization 
could result in operational advantages for biocatalytic systems.
For example, glucose oxidase, an enzyme that is rapidly
inactivated by H₂O₂, could be stabilized dramatically
upon immobilization, even using simple covalent attach- 
ment techniques. Various supports were used, including ac- 
tivated carbon, glass, metal oxides, and polymer resins (Bai- 
ley and Cho, 1983; Cho and Bailey, 1978, 1979). These 
early studies were augmented by quantitative evaluation of 
enzyme systems on solid supports, with direct comparison 
to soluble systems. Bailey and others found that the ob- 
served reaction kinetics could be affected by the nature of 
the immobilization support. In one of the first applications 
of electron paramagnetic resonance (EPR) spectroscopy
to immobilized enzymes, Bailey, along with then
graduate student Douglas Clark, demonstrated that immobilization
affects the structure and dynamics of the mammalian
protease, α-chymotrypsin (Clark and Bailey, 1984).
Structure-function studies were performed, which resulted 
in a firm understanding of the improvement in enzyme activity
obtained when linkers, rather than direct attachment, are used to
bind enzymes to the support. At the same time, it became possible to
quantify the influence of substrate diffusional limitations
on heterogeneous enzymatic
catalysis (Clark et al., 1985; Dennis et al., 1984). This study, 
expanded many fold by others on a wide range of commer- 
cially relevant enzymes, provided a quantitative foundation 
for immobilized enzyme technology that is critical today in 
the food, pharmaceutical, and chemical industries, and in 
both aqueous and nonaqueous media. Above all else, these 
studies led to the realization that enzymes [and whole cells 
(Doran and Bailey, 1986)] were capable of being manipu- 
lated simply by the environment that they were used in, and 
therefore, the enzyme technologist was not limited to the 
native properties of enzymes for eventual operation. 
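
The effect of substrate diffusional limitations on observed activity can be sketched with the standard textbook model of first-order reaction and diffusion in a flat slab of catalyst. This is an illustrative assumption and not the analysis used in the studies cited above. The rate constant k_first_order (roughly Vmax/Km for an enzyme far below saturation), the slab half-thickness, the diffusivity, and the function names are all hypothetical inputs.

```python
import math


def thiele_modulus(half_thickness, k_first_order, diffusivity):
    """Dimensionless ratio of reaction rate to diffusion rate for a slab:
    phi = L * sqrt(k / D), in any consistent set of units."""
    return half_thickness * math.sqrt(k_first_order / diffusivity)


def effectiveness_factor(phi):
    """Observed rate divided by the rate with no diffusional limitation,
    for first-order kinetics in a slab: eta = tanh(phi) / phi."""
    if phi < 1e-8:
        # Limit of tanh(phi) / phi as phi -> 0: no diffusional limitation.
        return 1.0
    return math.tanh(phi) / phi


# Assumed values: 100 micrometer half-thickness, k = 1 per second,
# effective diffusivity 1e-10 square meters per second.
phi = thiele_modulus(1e-4, 1.0, 1e-10)
eta = effectiveness_factor(phi)
```

For these assumed values the Thiele modulus is about 10, so only roughly a tenth of the intrinsic activity would be observed. Thinner supports or lower enzyme loadings reduce the modulus and bring the effectiveness factor back toward one.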



Nonaqueous Enzymology— Quantitative and 
Mechanistic Principles 

Enzymatic catalysis in organic solvents has dramatically 
shaped the emerging use of enzymes in organic synthesis 
(Dordick, 1992; Klibanov, 1990). Nonaqueous media result
in higher substrate solubility, reversal of hydrolytic
reactions, modified enzyme specificity, and improved ther- 
mostability. Nonaqueous conditions enable one to tap into 
novel enzyme activities and selectivities heretofore only 
possible using genetic modifications or complex multistep 
pathways. Because such an approach is so attractive, enzy- 
matic catalysis in nonaqueous media has undergone rapid 
expansion, particularly over the past decade. While there is 
little question that enzymes can function in nonaqueous me- 
dia, reaction rates are typically quite low and enzymes often 
have limited stability in such environments. In nearly all 
cases, the catalytic activity displayed by enzymes in nearly 
anhydrous (or neat) organic solvents is far lower than in 
water; as much as five orders of magnitude lower! There is 
nothing unique about subtilisin, and similarly diminished
activity of many enzymes (hydrolases and oxidases) is ob- 
served in organic solvents. Nevertheless, there may be noth- 
ing inevitable about this decline, and both its underlying 
causes and effective remedies are emerging. 

While Jay Bailey did not study biocatalysis in organic 
media, nonetheless his prior research yielded the tools nec- 
essary to make critical contributions in this area, particularly 
for studying heterogeneous enzyme preparations quantita- 
tively. For example, in the early 1990's, Affleck et al. 
(1992) and Xu et al. (1994), using enzyme kinetic, solution 
thermodynamic, and EPR spectroscopic techniques, uncov- 
ered an intriguing correlation between increased activity of 
enzymes in slightly hydrated organic solvents relative to dry 
solvents and the polarity of the enzyme's active site. Further 
quantitative studies involving pressure-based electrostric- 
tion resulted in one of the first transition-state mechanistic 
models of observed enzyme activity in nearly anhydrous 
environments. Specifically, Michels et al. (1998) used Kirkwood
electrostriction analysis to show that the dipole moment
of subtilisin's transition state was the same in hexane
as in water. Hence, it is not surprising that the enzyme is 
much less active in the former than the latter, as water can 
stabilize charge separation in the transition state that gives 
rise to such a dipole moment. Prior to this result, Khmel- 
nitsky et al. (1994) discovered that enzymes could be dra- 
matically activated (nearly 4,000-fold) by lyophilizing them 
in the presence of a nonbuffer salt such as KCl. Based on the
work by Michels et al. (1998), it was reasoned that charge
stabilization could be achieved by nonbuffer salts, which
provide a locally high polarity near the transition state of
the enzyme in nonpolar organic media, even in the absence 
of added water. Indeed, the 1990's provided a wealth of 
techniques to activate enzymes for use in organic media. 
Some of the prominent ones include the addition of polyols 
(Adlercreutz, 1993), crown ethers (Engbersen et al., 1996), 
transition-state analogs (Slade and Vulfson, 1998), and sub- 
strates and substrate mimics (Braco et al., 1990; Rich and 
Dordick, 1997). Beginning with a fundamental picture of
enzymes in the nonaqueous milieu, the field of nonaqueous 
enzymology has recently seen cases where enzymes func- 
tion in organic media as well as they do in water; again a 
striking realization that biocatalysis can be tailored by ma- 
nipulating an enzyme's heterogeneous environment. Such a 
result can be traced back to Jay Bailey and his initial work 
with glucose oxidase and α-chymotrypsin on solid supports.

Activated biocatalyst formulations have begun to impact 
society in terms of human health (better drugs), the envi- 
ronment (more efficient and selective synthetic conditions), 
and industry (new routes to existing and novel chemicals 
and materials). For example, salt-activated thermolysin 
catalyzes the regioselective acylation of paclitaxel (taxol) in 
a synthetic scheme to produce a water-soluble prodrug (pa- 
clitaxel 2'-adipic acid), which may result in an easier mode 
of delivery of this anticancer compound (Khmelnitsky et al., 
1997). The adipic acid derivative of taxol is ~1700-fold
more soluble in water than the native drug. In addition to the 
direct impact on taxol prodrug synthesis, activated biocata- 
lysts are beginning to find applications in the synthesis of 
chiral molecules and in the resolution of optical isomers, 
both critical in the preparation of new pharmaceuticals. 

Nonaqueous enzymology is quickly maturing, and com- 
bined with directed evolution (Affholter and Arnold, 1999) 
and gene-shuffling technologies (Powell et al., 2001), bio- 
catalysts with tailored and controllable activities, stabilities, 
and selectivities will yield still better commercial catalysts. 
The ability to use enzymes in organic media has also led to 
the bridging of the materials and biological worlds, wherein 
enzymes and other proteins can be incorporated into poly- 
mers (organic and inorganic) to provide structure and func- 
tion to the material (Wang et al., 1997). Opportunities in 
drug discovery, through protein chip technologies (Kim et 
al., 2001), and nanotechnology (Graff et al., 2001), through 
smart materials, will have a dominating impact in society 
for years to come. All of these advances can be linked to the 
fundamental immobilized enzyme studies initiated by Jay. 

MANIPULATION OF CELLULAR FUNCTION 

Three main reasons make metabolism a particularly inter- 
esting target for manipulation by a chemical engineer. First, 
in its most basic form, metabolism represents a large set 
(network) of chemical transformations operating in series/ 
parallel in one pot. This has a direct analogy with, say, the 
chemistry in a naphtha cracker, the analysis of which is 
widely regarded as the fountainhead of modern chemical 
reaction engineering. Jay Bailey was only too aware of this, 
since his graduate training and early independent research 
focused on the theoretical and experimental analysis of 
complex reaction networks. Second, most practical limita- 
tions of cells as biocatalysts are due to metabolic constraints 
or defects. It therefore follows that the enhancement of ex- 
isting biotechnological processes as well as the develop- 
ment of new ones rests heavily on a solid appreciation of the 
nuances of the underlying metabolism. Finally, the synthetic 



capabilities of metabolism (especially anabolic metabolism) 
are unparalleled. Thus, the ability to manipulate such pow- 
erful chemistry should be able to afford a virtually unlimited 
spectrum of new products. Since Bailey was especially in- 
terested in the practical applications of academic science, 
the latter two arguments also appealed to him. 

What does a chemical engineer need to learn to tackle 
metabolic problems? Bailey thought long and hard about 
this right from the beginning. He was particularly aware that 
metabolic engineering would be unable to sustain itself as a 
burgeoning field if it simply represented a collection of 
interesting anecdotal examples involving the manipulation 
of cellular function. Indeed, the title of his landmark article 
in Science (1991) is testament to his constant desire for a 
strong foundation based on rigorously tested principles and 
cutting-edge tools. Examples of such principles and tools 
include: 

1. The chemical logic of metabolism. While not apparent 
at first glance, the chemical logic of metabolism pro- 
vides an excellent basis for organizing an overwhelm- 
ingly large body of information into a "sensible" net- 
work of chemical transformations. For example, an un- 
derstanding of how and why energy is stored in the form 
of phosphoanhydride linkages in a cell allows one to 
appreciate the logic of many phosphorylation-dephos- 
phorylation reactions that occur during central carbon 
metabolism (glycolysis and TCA cycle). Likewise, by 
understanding the power of the aldol reaction, one can 
readily recognize how the six carbon atoms of glucose 
might be scrambled to yield the carbon backbone of each 
of the 20 amino acids in a cell. When viewed through the 
eyes of chemical structure and reactivity, metabolic stoi- 
chiometry (and to some extent, even metabolic kinetics) 
are no longer "dry facts," and a rational approach can be 
taken toward strategic decisions regarding which reac- 
tions to manipulate toward a metabolic engineering ob- 
jective. 

2. The biological logic of metabolism. While many aspects 
of metabolic reaction networks are conceptually related 
to networks associated with other problems in chemical 
reaction engineering, it is the biological control of me- 
tabolism that sets this problem area apart. Concepts such 
as positive and negative regulation of enzyme activity at 
a transcriptional and/or posttranscriptional level must be 
understood well by chemical engineers before they can 
conduct a meaningful analysis of a metabolic problem. 
The fact that a living cell has devised a dazzling range of 
circuits for metabolic control, most of which have no 
parallel in nonliving systems, does not make the chal- 
lenge any easier for the chemical engineer, but a back- 
ground in molecular biology can be invaluable in this 
regard. The development of effective ways to translate 
the molecular biologist's cartoon representations of 
metabolic control into quantitative models was a recur- 
rent theme in Jay Bailey's research (see, for example, 



492 BIOTECHNOLOGY AND BIOENGINEERING, VOL. 79, NO. 5, SEPTEMBER 5, 2002 



Lee and Bailey, 1984), and remains a major challenge in 
metabolic engineering to this day. 

3. Chemical tools. Tools of modern analytical chemistry 
are especially valuable to a metabolic engineer in his 
analysis of perturbation-response experiments. Often, 
due to the highly coupled and nonlinear connections that 
exist within a metabolic network, it is not possible to 
intuitively predict the outcome of a metabolic perturba- 
tion achieved via genetic or environmental manipulation. 
In conjunction with steady-state or pulse-chase experi- 
ments, analytical tools such as GC/LC-MS, UV spec- 
troscopy, fluorescence spectroscopy, and NMR spectros- 
copy allow one to monitor changes in concentrations and 
fluxes of metabolic intermediates (Bailey et al., 1987). 
Bailey saw the power of each one of these tools, as well 
as many others, and learned to not only apply them ef- 
fectively to metabolic problems but also to innovate with 
them. 

4. Biological tools. Above all else, the toolbox of modern 
molecular biology has been the driving force for the 
emergence of metabolic engineering science and appli- 
cations. Molecular biology has provided an unparalleled 
ability to manipulate genes individually or combinatori- 
ally, and also to monitor structural and dynamic changes 
in cellular macromolecules such as DNA, RNA, and pro- 
teins. At a time when many chemical engineers were 
wary of the impact of molecular biology tools on their 
field, Bailey was one of the first chemical engineers to 
wholeheartedly embrace this toolbox and to put virtually 
every tool to use in his attack on metabolic problems. 
More recently, he recognized the value of genomic and 
proteomic tools in extending the reach of molecular bi- 
ology to multi-variable analysis of metabolism. 

5. Mathematical tools. Although metabolism is a descrip- 
tive science, metabolic engineering is most effective 
when undertaken on the basis of a quantitative frame- 
work. Herein lies a particularly vexing challenge — 
although the formalism of metabolic control theory is 
adequately robust to describe most problems in metabolic
engineering, the complexity of most practical applications
in metabolic engineering precludes rigorous application
of this theory. Major (often drastic) approximations
must be made that limit the generality of
mathematical models. Jay Bailey experimented with the 
entire gamut of mathematical tools for quantitatively de- 
scribing and analyzing metabolic processes, ranging 
from rigorous coupled nonlinear differential equations, 
to cybernetic models based on "soft" axioms. This was a 
particularly strong passion of his during his last years. 
His final essay titled "Complex biology with no param- 
eters" provided advice as well as exhortation to present- 
day metabolic engineers (Bailey, 2001). 

In addition to developing and refining metabolic engi- 
neering principles and tools, Jay Bailey also made contri- 
butions to many important problems in applied metabolic 
engineering. Examples of problems he studied included pH 



homeostasis in microorganisms, central carbon metabolism 
in E. coli and yeast, oxidative (especially microaerobic) 
metabolism, posttranslational modifications of proteins in 
mammalian cell culture, and product-oriented niche me- 
tabolism such as solventogenic fermentation processes. 
Within his exceptionally broad research program, each of 
these problems was marked by a desire to understand, control,
and manipulate the metabolic property of interest. Bailey's
choice of problems and his approach to a solution were
characteristic, and had an enormous influence on the field.
His work has been widely cited and will almost certainly be 
studied carefully by future generations of metabolic engi- 
neers. 

DRUG DISCOVERY 

Because of the complexity of many proven biologically ac- 
tive compounds, traditional drug discovery methods have 
begun to be supplemented by nontraditional methodologies 
that focus on either rational or combinatorial techniques for 
drug discovery and development. The former requires ex- 
tensive knowledge about a biological target (e.g., a recep- 
tor's binding site or an enzyme's active site), and utilizes 
computational chemistry and molecular modeling to design 
chemical structures that may elicit a biological response.
Often, such molecular targets are either not available, or 
their structures are not available in sufficient resolution to 
provide suitable targets for molecular modeling. In these 
cases, drug discovery has recently turned to combinatorial 
techniques for new lead discovery and development. Within 
the past decade there has been a major growth in the appli- 
cation of engineered enzymes and whole cells to problems 
in drug discovery. Two such examples are described below. 

Combinatorial Biology 

Secondary metabolism refers to metabolic processes in a 
cell that occur during post-exponential (often stationary) 
phase of growth, and are therefore unnecessary for the sur- 
vival or reproductive capacity of a growing cell. Many mi- 
croorganisms and plants produce structurally complex natu- 
ral products as secondary metabolites; some of these have 
exquisite biological activities that have led to their exploi- 
tation as antibiotics, anticancer agents, or other pharmaco- 
logically useful agents. Examples of well known natural 
products synthesized as secondary metabolites include the 
antibiotics penicillin, streptomycin, erythromycin, tetracy- 
cline, vancomycin, the anticancer agents adriamycin, taxol 
and vinblastine, the cholesterol lowering agents compactin 
and lovastatin, and the immunosuppressants cyclosporin, 
FK506, and rapamycin. 

Although it has been recognized for many decades now 
that, notwithstanding structural distinctions, the biosyn- 
thetic pathways of many of these natural products are re- 
lated, an understanding of the associated catalytic mecha- 
nisms is only just beginning to emerge. Hand in hand with 
these fundamental insights, the application of protein engineering
principles to natural product biosynthesis has resulted
in the emergence of a new field, often referred to as
combinatorial biosynthesis, where the structure of a natural
product is systematically manipulated by genetic manipula- 
tion of the biosynthetic enzymes. Combinatorial biosynthe- 
sis has yielded numerous new "unnatural" natural products 
over the past 10 years, and can be used to optimize the 
properties of existing and emerging bioactive natural prod- 
ucts (Rodriguez and McDaniel, 2001). In addition, it could 
also be used to construct new natural product libraries for 
drug discovery. In particular, the enzymes responsible for 
biosynthesis of four major classes of natural products — 
polyketides, nonribosomal peptides, isoprenoids, and de- 
oxysugars and related aminocyclitols — are emerging as es- 
pecially fertile targets for genetic and chemo-biosynthetic 
manipulation (Cane et al., 1998). The first products from 
such biosynthetic engineering efforts are already entering 
clinical trials. As more such engineered metabolites emerge 
from discovery into development, the molecular vision that 
Jay Bailey gave to the field of biochemical engineering will 
undoubtedly be realized. 

Combinatorial Biocatalysis for Lead 
Compound Optimization 

Rapid developments in genomics, proteomics, and combi- 
natorial chemistry have reshaped the field of drug discov- 
ery, providing new drug targets for selective screens and 
new compounds to be tested in those screens. While com- 
binatorial methods have given rise to large libraries of com- 
pounds, typically these compounds result in improved lead 
candidates that must undergo further transformations by 
conventional medicinal chemistry to yield new drug candi- 
dates. High-throughput combinatorial methodologies have 
not impacted lead optimization nearly as much as they have 
lead discovery, mainly because of the highly selective, in- 
tricate chemistries often required to optimize lead com- 
pounds. This is particularly challenging for optimization of 
natural products or complex synthetic leads, the latter often 
coming from initial combinatorial synthesis and high- 
throughput screening. 

Nature's most potent molecules are produced by enzyme- 
catalyzed reactions coupled with natural selection of those 
products with optimal biological activity. Combinatorial 
biocatalysis harnesses the natural diversity of enzymatic re- 
actions for the synthesis of organic compound libraries to 
generate biologically active compounds, which encompass 
a wide array of chemistries and structures (Michels et al., 
1998). Combinatorial biocatalysis is focused on both bio- 
catalytic transformation and iterative synthesis. The more 
complex the lead compound, the more iterations that are 
possible and the larger the library of derivatives. Hence, 
thousands of derivatives of the original lead compound can 
be produced using combinatorial biocatalysis. Initial devel- 
opment of this technology focused on the generation of 
solution-phase combinatorial libraries, including those from 
synthetic precursors (e.g., dibenzyl 1,2-phenylenedioxydiacetate
as bis-amide derivatives) (Adamczyk et al., 1997), as
well as natural products such as flavonoids (e.g., bergenin),
polyketides (e.g., doxorubicin and erythromycin) (Altreuter
et al., 2002), nucleosides (e.g., adenosine), and diterpenoids
(e.g., paclitaxel) (Khmelnitsky et al., 1997), which have been
generated in solution using enzymes or whole cells and their
extracts. Together with combinatorial biology, the ability to 
tap into even a small part of nature's vast repertoire of 
biocatalytic machinery is now possible. 

FINAL THOUGHTS 

Jay Bailey trained as a chemical engineer during the heyday 
of catalytic reaction engineering as practiced in the petro- 
chemical industry. He foresaw the impact that enzymes and 
cells were likely to have on the development of new prod- 
ucts and processes, and dedicated himself to the visionary 
goal of educating new students of chemical engineering 
about the power of modern biology in this endeavor. Just a 
few short decades ago, not only were chemical engineers 
ignorant of biology, but it was difficult to make a compel- 
ling case for the relevance of biology to our discipline. The 
fact that catalysis and biocatalysis have merged so inti- 
mately in such a short timespan is a wonderful testament to 
Jay Bailey's impact on chemical engineering. Now that bi- 
ology is becoming a quantitative discipline, the impact of 
Jay Bailey extends to the biomolecular sciences, including 
the interface areas of genomics and proteomics, nanobio- 
technology, and high-throughput drug discovery. Thus, Bai- 
ley will impact biology in years to come as much as he 
impacted biochemical engineering in the past three decades. 
In 2050, as chemical engineers prepare to inaugurate the 
first zero-emission manufacturing facility that converts at- 
mospheric carbon dioxide into automotive fuel using an 
engineered multienzyme system, microorganism, or plant as 
a catalyst, they will trace the roots of their accomplishments 
back to Jay Bailey's vision for molecular bioengineering in 
very much the same way as modern aircraft designers rec- 
ognize the contributions of von Karman in the area of jet 
propulsion. 
Dear Bernhard,

I look forward to seeing you again at the Biochemical Engineering Conference in Salt 
Lake City. In anticipation of that, I am writing with some questions about your 
genomics-flux balance approach. 

Your recent paper in Biotechnology Progress is a very effective synopsis of general 
aspects of genomics and future challenges which I think is very useful for all 
bioengineers to see. The rest of the paper, presenting a number of case studies and 
results which are based on flux analysis, is harder for me to understand. 

As I can infer from the paper, you are somehow calculating how flux distributions 
and specific growth rates change when certain genes are not expressed. This 
calculation seems to rest on an optimization calculation, which I guess must be a 
linear programming problem, in which the fluxes are the decision variables and 
making cell mass is the objective function. How does this work exactly? What is the 
optimization problem formulation? What are the constraints besides the metabolite 
mass balances? What metabolites are assumed to enter and exit the cell, and what 
determines those rates? And, finally, a major question: how do you arrive at
conclusions about, say, changes in specific growth rates, given a stoichiometric 
network? How do you get from stoichiometry to rates? 

The matter of enzymes added to those implied by the genome sequence (47 of 587 
according to your BP paper) is also a question. Could you please send me a list of 
these added steps? 

I hope that you might take some time in your presentation in San Diego to explain 
some of these points in some detail. This method, if it is correct and if it works, is a 
very major advance, because it gives a formalism for determining growth rates and 
pathway rates without any knowledge of any kinetics. I must say it is a little hard to
believe that this can be done, but maybe you have made a breakthrough.
Any feedback you could give me before the meeting would be much appreciated.
Thanks very much. See you soon. 
Robustness Analysis of the Escherichia coli Metabolic Network

Genomic, biochemical, and strain-specific data can be assembled to define an in silico
representation of the metabolic network for a select group of single cellular organisms. 
Flux-balance analysis and phenotypic phase planes derived therefrom have been 
developed and applied to analyze the metabolic capabilities and characteristics of 
Escherichia coli K-12. These analyses have shown the existence of seven essential 
reactions in the central metabolic pathways (glycolysis, pentose phosphate pathway, 
tricarboxylic acid cycle) for the growth in glucose minimal media. The corresponding 
seven gene products can be grouped into three categories: (1) pentose phosphate 
pathway genes, (2) three-carbon glycolytic genes, and (3) tricarboxylic acid cycle genes. 
Here we develop a procedure that calculates the sensitivity of optimal cellular growth 
to altered flux levels of these essential gene products. The results indicate that the E. 
coli metabolic network is robust with respect to the flux levels of these enzymes. The 
metabolic flux in the transketolase and the tricarboxylic acid cycle reactions can be 
reduced to 15% and 19%, respectively, of the optimal value without significantly 
influencing the optimal growth flux. The metabolic network also exhibited robustness 
with respect to the ribose-5-phosphate isomerase, and the ribose-5-phosphate
isomerase flux was reduced to 28% of the optimal value without significantly affecting
the optimal growth flux. The metabolic network exhibited limited robustness to the 
three-carbon glycolytic fluxes both increased and decreased. The development 
presented another dimension to the use of FBA to study the capabilities of metabolic 
networks. 



Introduction 

Genome sequencing and bioinformatics are beginning 
to reveal the complete set of molecular components 
involved in cellular activities. Furthermore, it is also clear 
that the integrated function of biological systems involves 
complex interactions among the components that have 
been identified through bioinformatics and genomics. 
Importantly, the properties of complex systems cannot 
be predicted simply on the basis of the complete descrip- 
tion of their components, and the emergent properties 
of biological systems need to be studied (1, 2). To
understand the complexity inherent in cellular networks, 
approaches that focus on the systemic properties of the 
network are required. The focus of such research repre- 
sents a departure from the classical reductionist ap- 
proach to the integrated approach (3) to understanding 
the interrelatedness of gene function and the role of each 
gene in the context of multigenetic cellular functions or 
genetic circuits (4, 5). 

The engineering approach to analysis and design is to 
have a mathematical or computer model, e.g., a dynamic 
simulator, of a cellular process that is based on funda- 
mental physicochemical laws and principles. There has 
been a long history of mathematical modeling of metabolic
systems, which dates back to the mid 1960s.
With the availability of analogue computers and the 
knowledge of metabolic regulation, dynamic simulations 
of simple metabolic and genetic control loops appeared
(6). The dynamic stability of such control loops became a 
focus of attention (7, 8), given the experimental observa- 
tions of oscillatory dynamics in yeast glycolysis (9). 

The systemic nature of metabolic function was appar- 
ent, and so was its complexity. However, the availability 
of enzyme kinetic information was fragmented, and 
attention turned to developing methods that could shed 
light on the relative importance of various metabolic 
events. Methods for sensitivity analysis of metabolic 
regulation began in the 1960s (10) and continued into 
the 1970s (11, 12). The results of these undertakings were 
biochemical systems theory (BST) and metabolic control 
analysis (MCA), and some useful results have been 
obtained using these approaches (13).

Establishing complete kinetic models of cellular me- 
tabolism became a scientific goal, whose intended use was 
to elucidate the systemic behavior of metabolic networks. 
Because of its simplicity, the human red blood cell 
represented the best opportunity to achieve this goal. 
Early metabolic models of human red blood cell metabo- 
lism appeared in the 1970s (11) and continued through- 
out the 1980s and 1990s (14-16). Insights into the 
functioning of this cell have resulted from these analyses 
(11, 17, 18). Although interesting in their own right, 
studies of red cell metabolism are not directly useful for 
organisms of industrial importance. 

While the ultimate goal is the development of dynamic 
models for the complete simulation of metabolic systems, 
the success of such approaches has been severely ham- 
pered by the current lack of kinetic information on the 
dynamics and regulation of metabolic reactions. However, 
in the absence of kinetic information it is still possible
to accurately assess the theoretical capabilities and 
operative modes of metabolic systems using metabolic 
flux balance analysis (FBA) (5, 19-23). FBA is based on 
the fundamental physicochemical constraints on meta- 
bolic networks. FBA only requires information regarding 
the stoichiometry of metabolic pathways and the meta- 
bolic demands; furthermore, FBA can incorporate ad- 
ditional information when it is available. FBA is par- 
ticularly applicable for post-genomic analysis, because 
the stoichiometric parameters can be defined from the 
annotated genome sequence (21). 

In a previous article, we have examined the capability 
of in silico mutant E. coli metabolic networks to support 
growth and compared the results to the wildtype. By 
using computer simulations, it was determined that 
seven metabolic reactions were essential for the aerobic 
growth of E. coli in glucose minimal media (24). The
remaining reactions were determined to be nonessential, 
since the metabolic network maintained the capability 
to bypass simulated metabolic defects, often with little 
or no effect on the in silico maximal biomass yield. In 
this article we will further examine the essential meta- 
bolic reactions by examining the metabolic consequences 
of reduced metabolic flux carrying capacity in the es- 
sential reactions. The results indicate the redundancy 
and robustness in the function of the respective metabolic 
reactions in the metabolic network by examining the 
sensitivity of the objective function to the quantitative 
flux levels. The sensitivity analysis can provide informa- 
tion regarding the experimental measurements that are 
likely to provide the most information toward quantita- 
tively describing the metabolic network and can be used 
for in silico experimental design and assessing the value 
of the in silico predictions. 
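
A minimal sketch of this kind of robustness calculation is given below, assuming a hypothetical five-reaction toy network rather than the E. coli reconstruction. Growth is maximized by linear programming (SciPy's `linprog` is used here purely for illustration and is not necessarily the solver used in this work), and the flux through one essential reaction is then capped at a fraction of its optimal value. The network, the 0.9 yield of the bypass route, the uptake limit of 10, and all function names are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not the E. coli reconstruction).
# Columns: uptake (-> A), r1 (A -> B), r2 (A -> 0.9 C), r3 (B -> C), growth (B + C ->)
S = np.array([
    [1.0, -1.0, -1.0,  0.0,  0.0],   # metabolite A
    [0.0,  1.0,  0.0, -1.0, -1.0],   # metabolite B
    [0.0,  0.0,  0.9,  1.0, -1.0],   # metabolite C
])
UPTAKE, R1, GROWTH = 0, 1, 4


def max_growth(r1_cap=None, uptake_max=10.0):
    """Maximize the growth flux subject to S v = 0 and flux bounds."""
    bounds = [(0.0, None)] * S.shape[1]      # all reactions treated as irreversible
    bounds[UPTAKE] = (0.0, uptake_max)       # assumed substrate uptake limit
    bounds[R1] = (0.0, r1_cap)               # None means no upper limit on r1
    c = np.zeros(S.shape[1])
    c[GROWTH] = -1.0                         # linprog minimizes, so negate growth
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun, res.x


def robustness_curve(fractions):
    """Relative optimal growth when r1 is capped at a fraction of its optimal flux."""
    ref_growth, ref_flux = max_growth()
    r1_opt = ref_flux[R1]
    return [(f, max_growth(r1_cap=f * r1_opt)[0] / ref_growth) for f in fractions]
```

In this toy network, capping r1 at 50% of its optimal flux leaves 95% of the optimal growth because the lower-yield route r2 compensates, whereas capping it at 20% leaves only 40%; the numbers are properties of the invented network and say nothing about E. coli.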

Describing Metabolic Systems 

A metabolic network is a collection of enzymatic 
reactions that serve to biochemically process metabolites 
within the cell and transport processes that convert 
extracellular metabolites to intracellular metabolites and 
vice versa. To quantitatively describe metabolic networks, 
dynamic mass balances are written for each metabolite 
in the network, generating a system of ordinary dif- 
ferential equations that describe the transient behavior 
of metabolite concentrations: 



$$\frac{dX_i}{dt} = \sum_{j} S_{ij} v_j \qquad (1)$$



where $v_j$ corresponds to the jth metabolic flux, $X_i$ represents
the ith metabolite, and the stoichiometric coefficient
$S_{ij}$ stands for the number of moles of metabolite i formed
(or consumed) in reaction j. Equation 1 is particularly 
difficult to solve since the metabolic fluxes are often 
nonlinear functions of the metabolite concentrations, as 
well as a set of kinetic parameters that are difficult to 
measure or estimate. The complexity associated with 
estimating the functional relation between the metabolic 
fluxes and the metabolite concentrations and the associ- 
ated kinetic parameters has hampered the quantitative 
analysis of metabolic networks. 
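
To make the structure of eq 1 concrete, the following sketch integrates the dynamic mass balances for a hypothetical two-metabolite chain. The mass-action rate laws and the rate constants are assumptions chosen only for illustration; in a real network these kinetic expressions are exactly the quantities that are hard to obtain.

```python
import numpy as np

# Hypothetical chain: -> A -> B ->
# Rows: metabolites A, B. Columns: v0 (supply of A), v1 (A -> B), v2 (drain of B).
S = np.array([
    [1.0, -1.0,  0.0],
    [0.0,  1.0, -1.0],
])


def fluxes(x, k1=2.0, k2=1.0, supply=1.0):
    """Assumed mass-action kinetics; the rate constants are invented."""
    a, b = x
    return np.array([supply, k1 * a, k2 * b])


def simulate(x0, dt=0.01, steps=2000):
    """Explicit Euler integration of dX/dt = S v(X)."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * (S @ fluxes(x))
    return x
```

Starting from zero concentrations, the simulated state approaches A = 0.5 and B = 1.0, at which point the product of S and the flux vector vanishes; every kinetic parameter used to get there had to be supplied by hand.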

Constraining Metabolic Functions 

Given the complexities associated with quantitative 
analysis of metabolic systems based on kinetic charac- 
terization of the components, we have utilized a concep- 
tually different approach to the analysis of metabolic 
networks. First, we defined fundamental physicochemical 



constraints to which the metabolic network is subject.
Then, the metabolic capabilities were assessed
subject to the imposed constraints. The capabilities are 
analyzed under the steady state assumption. It should 
be noted that steady state analysis is applicable to some 
aspects of metabolism; however, the approach will not 
be appropriate for studying all cellular processes, such 
as the cell cycle or signal transduction. Herein, we are 
interested in metabolic processes and their relation to 
cellular growth; thus the characteristic time of the 
processes is about an hour. Metabolic transients within 
the cell typically occur with time constants on the order 
of seconds to minutes (25); thus under our "window of 
observation" the metabolic network is essentially in a 
steady state and the steady-state analysis will be ap- 
propriate. The steady-state mass, energy, and redox 
balance constraints are imposed by simplifying eq 1: 



$$S \cdot v = 0 \qquad (2)$$

where S is the stoichiometric matrix and v is the flux 
vector. While the system is closed to the passage of 
certain metabolites, others are allowed to enter or exit 
the system via exchange fluxes (or pseudoreactions (26)). 
These fluxes do not represent biochemical conversions 
or transport processes such as those of internal fluxes 
but can be thought of as representing the inputs and 
outputs to the system. For example, the demand on a 
metabolite for further processing or incorporation into 
cellular biomass creates an exchange flux on the internal 
cellular metabolite. Thus, a distinction is made between 
internal and external metabolites in the system, therefore 
closing the material balance to all metabolites as indi- 
cated by eq 2. 
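
As a minimal sketch of the steady-state balance in eq 2, the following checks whether a flux vector closes the material balance for every metabolite. The three-flux network, the metabolite names A and B, and the helper names are hypothetical and chosen only for illustration; they are not part of the E. coli model described in the text.

```python
# Hypothetical toy network (not the E. coli model from the text):
#   b1: -> A     exchange flux supplying metabolite A
#   v1: A -> B   internal conversion
#   b2: B ->     exchange flux draining B (for example, a biomass demand)
# Rows are metabolites (A, B); columns are fluxes (b1, v1, b2).
S = [
    [1.0, -1.0, 0.0],   # A: produced by b1, consumed by v1
    [0.0, 1.0, -1.0],   # B: produced by v1, consumed by b2
]

def net_production(S, v):
    """Return S . v, the net production rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

def is_steady_state(S, v, tol=1e-9):
    """True when every metabolite balance in eq 2 closes within tol."""
    return all(abs(x) <= tol for x in net_production(S, v))

balanced = [2.0, 2.0, 2.0]     # everything that enters as A leaves as B
unbalanced = [2.0, 1.0, 1.0]   # A would accumulate, so eq 2 is violated

print(is_steady_state(S, balanced), is_steady_state(S, unbalanced))
```

Treating the exchange fluxes as extra columns of S is what lets the balance close for all metabolites, which mirrors the role the text assigns to exchange fluxes.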

To complete the in silico representation of the meta- 
bolic network we included the constraints on the indi- 
vidual metabolic reaction fluxes due to reaction thermo- 
dynamics and the input/output characteristics of the 
network. All reversible metabolic reactions were assumed
to have the capability to carry any metabolic flux (i.e.,
$-\infty < v_j < \infty$, where $v_j$ is the flux in a reversible reaction),
whereas irreversible metabolic reaction fluxes were
restricted to be positive (i.e., $0 \le v_j < \infty$, where $v_j$ is the
flux in an irreversible reaction). Although constraints on
the internal fluxes were defined as infinite, the magni- 
tude of each flux in the optimal solution was examined 
and compared to measured fluxes (27, 28). The revers- 
ibility of each reaction in the metabolic network was 
determined case by case on the basis of the literature 
and compared to the EcoCyc database (29). The metabolic 
enzymes identified in the complete E. coli K12 genome 
sequence and the online databases (29-31) were used 
to reconstruct the metabolic network (see supplementary 
information at http://gcrg.ucsd.edu/supplementary_data/ 
BP2000/main.htm). It should be noted that there are 
instances where the same enzyme can catalyze multiple 
reactions (e.g., different substrates or cofactors), and this 
situation was considered by including all reactions cat- 
alyzed by an enzyme as a separate column in the stoi- 
chiometric matrix. The details of this metabolic recon- 
struction have been described elsewhere (24). Addition- 
ally, constraints were placed on the exchange fluxes to 
indicate the environmental conditions. For example, metabolites
not available to the cell are constrained to not
enter the cell: $-\infty < b_i \le 0$, where $b_i$ (influx defined as
positive) is the exchange flux for a metabolite $i$ not
available in the simulated environment. It should be
noted that all metabolites that have the capability to 
leave the cell always had unconstrained metabolic fluxes 
in the net outward direction, whereas the influx constraints
were defined by the simulated environmental
conditions. For the analysis herein, the exchange fluxes for
inorganic phosphate, ammonia, carbon dioxide, oxygen,
sulfate, potassium, and sodium were unconstrained,
whereas the uptake of the carbon source was constrained
as specified.

Demands on the Metabolic Network 

Under changing substrate/supply conditions metabolic 
networks are continuously faced with a balanced set of 
biosynthetic demands (i.e., production of amino acids, 
nucleotides, phospholipids, as well as energy and redox 
potential). Effectively this means that the network must 
generate a balanced set of metabolites that are used to 
produce biomass. The biosynthetic demands for growth 
were determined from the biomass composition of E. coli 
(32, 33), and a metabolic flux, defined as the growth flux
($v_{growth}$), utilizes the biosynthetic precursors in the appropriate
ratios so as to generate biomass:

$\sum_i d_i X_i \xrightarrow{v_{growth}} \text{Biomass}$

where $d_i$ (mmol · g-dry weight (DW)$^{-1}$) is the E. coli
biomass composition of metabolite $i$. One gram of biomass
is produced per unit flux in the growth flux, $v_{growth}$, and
if the fluxes are represented with a basis of 1 g-DW · h
(22), the growth flux is equivalent to the growth rate.
The biomass composition is not constant but depends on 
the growth rate and the growth conditions (33). However, 
we have assumed that the biomass composition is con- 
stant since it has been shown that the optimal solution 
is not sensitive to the biomass composition (34), and this 
observation is also true for our system. 

In addition to the biosynthetic demands on the meta- 
bolic network, we have also imposed maintenance re- 
quirements on the metabolic system. The maintenance 
requirements included were for growth-associated and 
non-growth-associated maintenance. We imposed a 
growth-associated maintenance of 23 mmol ATP · g-DW$^{-1}$
and a non-growth-associated maintenance of 5.87 mmol
ATP · g-DW$^{-1}$ · h$^{-1}$ (35).
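
The units of the two maintenance terms suggest how they combine into a single ATP drain: the growth-associated term scales with the growth rate, while the non-growth-associated term is a fixed rate. The short calculation below is a sketch under that assumption; the function name and the linear combination are illustrative and are not taken from the model files.

```python
GAM = 23.0    # growth-associated maintenance, mmol ATP per g-DW (value from the text)
NGAM = 5.87   # non-growth-associated maintenance, mmol ATP per g-DW per h (value from the text)

def atp_maintenance_flux(growth_rate):
    """Total ATP maintenance drain (mmol ATP / g-DW / h) at a growth rate given in 1/h.

    Assumes the two terms add linearly: GAM * growth_rate + NGAM.
    """
    if growth_rate < 0:
        raise ValueError("growth rate must be non-negative")
    return GAM * growth_rate + NGAM

for mu in (0.0, 0.5, 1.0):
    print(f"mu = {mu:.1f} 1/h -> ATP maintenance = {atp_maintenance_flux(mu):.2f}")
```

At zero growth only the non-growth-associated term remains, which is why that term acts as a floor on the ATP the network must produce.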

Exploring the Metabolic Capabilities 

The constraints on the metabolic network define the 
boundaries within which the metabolic system must 
operate. The mass, energy, and redox balance constraints 
are imposed by the linear homogeneous set of equations 
(eq 2). The nullspace of the stoichiometric matrix, S, 
contains all flux vectors that satisfy the mass, energy, 
and redox balance constraints (36). However, there are 
additional physicochemical constraints on the metabolic 
network, such as the thermodynamic constraints and the 
capacity constraints on the exchange fluxes, which are 
enforced by linear inequalities. The simultaneous en- 
forcement of all the metabolic constraints defines a 
region, the feasible set, that contains all feasible metabolic 
flux vectors. The feasible set is not a vector space as is 
the nullspace, as a result of the linear inequality con- 
straints. Importantly, the feasible set defines the meta- 
bolic capabilities of the system. The performance capa- 
bilities of any metabolic network reside in the feasible 
set. In fact, the answer to any question related to the 
general structure and fitness of the network lies with this 
region. While the feasible set offers a convenient way of 
defining metabolic capabilities, the question arises, how 
do we best explore the specific functions of a metabolic 
network? 
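
The distinction between the nullspace and the feasible set can be illustrated numerically. The sketch below assumes a hypothetical three-flux toy network with all reactions irreversible; it computes a nullspace basis with SciPy and shows that a balanced flux vector lies in the feasible set while its negative, which is still in the nullspace, does not.

```python
import numpy as np
from scipy.linalg import null_space

# Hypothetical toy network (illustration only): -> A, A -> B, B ->
S = np.array([
    [1.0, -1.0, 0.0],   # metabolite A
    [0.0, 1.0, -1.0],   # metabolite B
])

# Columns of N form an orthonormal basis of {v : S v = 0}.
N = null_space(S)

# Orient the single basis vector so that its first entry is positive.
v = N[:, 0] if N[0, 0] > 0 else -N[:, 0]

def in_feasible_set(flux, tol=1e-9):
    """Balanced (S v = 0) and respecting the assumed irreversibility (v >= 0)."""
    balanced = np.allclose(S @ flux, 0.0, atol=tol)
    irreversible_ok = bool(np.all(flux >= -tol))
    return balanced and irreversible_ok

print(N.shape)
print(in_feasible_set(v), in_feasible_set(-v))
```

Because the negative of a feasible vector fails the inequality constraints, the feasible set is not closed under scalar multiplication, which is the sense in which it is not a vector space.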



One approach that has been used to explore the 
relationship between the metabolic genotype and phen- 
otype for a number of organisms is linear optimization 
(19, 21, 22, 37). Linear optimization was used to deter- 
mine the optimal flux distributions within a network so 
as to maximize/minimize a particular objective function. 
A linear programming problem is defined as follows, 
where a linear objective function is maximized or mini- 
mized subject to a series of linear equality and inequality 
constraints: 

Maximize/Minimize $Z = \sum_j c_j v_j$
subject to $\sum_j S_{ij} v_j = 0$, $\alpha_j \le v_j \le \beta_j$ (5)

The linear programming formalism is analogous to the 
system of linear equalities/inequalities that form the 
constraints on the metabolic network. The objective 
function, Z, is defined by assigning the appropriate values 
to the c vector; herein, the c vector was taken as the unit 
vector in the direction of the growth flux. We used the 
reduced costs from the linear programming solution to 
identify alternate optimal solutions. In metabolic engi- 
neering applications, the objective function can cor- 
respond to a number of diverse objectives, such as 
maximizing energy or metabolite production (20). How- 
ever, regardless of the objective function the optimal 
solution will lie within the feasible set that is defined by 
the physicochemical constraints placed on the system. 
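
The linear program in eq 5 can be sketched on a small example with `scipy.optimize.linprog`. The four-reaction network, its bounds, and the variable names below are assumptions made for illustration; only the structure (maximize a unit vector in the direction of the growth flux subject to S v = 0 and flux bounds) follows the text. Since `linprog` minimizes, the objective vector is negated.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustration only).
# Columns: uptake (-> A), efficient route (A -> B),
#          bypass (A -> 0.5 B), growth (B -> biomass).
S = np.array([
    [1.0, -1.0, -1.0, 0.0],   # metabolite A
    [0.0, 1.0, 0.5, -1.0],    # metabolite B
])

# c is the unit vector in the direction of the growth flux.
c = np.array([0.0, 0.0, 0.0, 1.0])

# alpha_j <= v_j <= beta_j: uptake capped at 10, all reactions irreversible.
bounds = [(0.0, 10.0), (0.0, None), (0.0, None), (0.0, None)]

# linprog minimizes, so maximize c . v by minimizing -c . v.
res = linprog(-c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
growth = -res.fun

print("optimal fluxes:", np.round(res.x, 6))
print("maximal growth flux:", round(growth, 6))
```

In this toy case the optimum routes all uptake through the efficient reaction and leaves the bypass unused, so the solution sits at a vertex of the feasible set.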

The utilization of linear programming to examine 
metabolic networks defines the optimal flux vector that 
maximizes (or minimizes) an objective function and 
satisfies the entire set of constraints. The utilization of 
design related objectives (such as maximizing the pro- 
duction of an amino acid) can be used to guide genetic 
engineering of a strain for metabolite overproduction. 
Herein, we have employed a physiologically realistic 
objective, the maximization of the growth flux. We have 
assumed that the cell has evolved the regulatory mech- 
anisms to operate optimally within the feasible set. The 
feasible set defines the capabilities of the metabolic 
network, and all metabolic flux vectors within the feasible 
set satisfy the imposed physicochemical constraints. 
Therefore, theoretically all flux vectors within the feasible 
set can be reached by adjusting the enzyme kinetic 
parameters and gene regulation. The enzyme kinetics 
and gene regulation constraints on the metabolic system 
will be referred to as system specific constraints. We 
assume that the cell has found the optimal set of system 
specific constraints through the course of evolution, and 
we attempt to find the same solution using linear 
programming. The assumption has been experimentally 
examined under a limited number of conditions, and 
under defined conditions with a single carbon source, the 
experimental data is consistent with the optimal utiliza- 
tion of the metabolic network (27). 

Phenotype Phase Plane Analysis 

Flux balance analysis can be used to examine the 
metabolic network in detail. Optimal solutions to the 
linear programming problem will then lie on a vertex of 
the feasible set, which is a polyhedron (38). All the 
metabolic flux vectors (or metabolic phenotypes) attain- 
able from a defined metabolic genotype are mathemati- 
cally confined to the feasible set. Linear programming 
was used to search through the feasible set for a solution 
that maximizes the growth flux. Experimental data for 
the growth of E. coli under nutritionally rich growth 
conditions (i.e., cell is not starved for phosphate, nitrogen, 



Biotechnol. Prog., 2000, Vol. 16, No. 6 



etc.) is consistent with the optimal utilization of the 
metabolic network (27); thus, defining the growth flux 
as the objective function produces physiologically mean- 
ingful results. However, the optimal flux distribution is 
only meaningful when interpreted in terms of the specific 
environmental conditions. Therefore, phenotype phase
planes (PhPPs) (39) have been developed to define the range of
optimal flux vectors and how the optimal flux vector
depends on the environmental conditions.

The methodology for defining PhPPs has been de- 
scribed (39). We will now briefly describe the construction 
of PhPPs. Two metabolic fluxes can form two axes on an
(x, y)-plane (these metabolic fluxes were two unit vectors
in $\mathbb{R}^n$). The optimal metabolic flux distribution is calculated
for all points in this plane. In other words, the
maximum value of the objective function is found as the 
position of the hyperplanes that bound the feasible set 
in the respective directions is moved. It has been deter- 
mined that there are a finite number of fundamentally 
different optimal metabolic flux distributions (or basis 
solutions in linear programming terminology) present in 
such a plane. The demarcations on the phase plane were 
defined by a shadow price (LP dual variable) analysis 
(40). This procedure leads to the definition of distinct 
regions, or "phases", in the plane, in which the optimal 
use of the metabolic network is fundamentally different, 
corresponding to different optimal phenotypes. 
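
The construction of a phase plane can be sketched by solving the linear program over a grid of two flux constraints and inspecting the dual variables at each point. The example below is a rough illustration under several assumptions: the network (carbon uptake, oxygen uptake, a respiratory and a fermentative route to a biomass precursor) is hypothetical, uptake caps stand in for the two axis fluxes, and the phase label uses only whether the oxygen balance has a nonzero dual. It relies on the `eqlin.marginals` field that SciPy's HiGHS-based `linprog` returns.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (an assumption for illustration, not the E. coli model).
# Columns: carbon uptake, oxygen uptake, respiration (C + O -> X),
#          fermentation (C -> 0.2 X), growth (X -> biomass).
S = np.array([
    [1.0, 0.0, -1.0, -1.0, 0.0],   # C: carbon source
    [0.0, 1.0, -1.0, 0.0, 0.0],    # O: oxygen
    [0.0, 0.0, 1.0, 0.2, -1.0],    # X: biomass precursor
])
c = np.array([0.0, 0.0, 0.0, 0.0, 1.0])  # objective: the growth flux

def solve(carbon_cap, oxygen_cap):
    """Maximize growth at one point of the plane; return (growth, metabolite duals)."""
    bounds = [(0.0, carbon_cap), (0.0, oxygen_cap)] + [(0.0, None)] * 3
    res = linprog(-c, A_eq=S, b_eq=np.zeros(3), bounds=bounds, method="highs")
    return -res.fun, res.eqlin.marginals

def phase(carbon_cap, oxygen_cap, tol=1e-9):
    """Label a point by whether the oxygen balance carries a nonzero dual value."""
    _, shadow = solve(carbon_cap, oxygen_cap)
    return "oxygen-limited" if abs(shadow[1]) > tol else "oxygen in excess"

for carbon_cap in (5.0, 10.0):
    for oxygen_cap in (2.0, 8.0, 20.0):
        growth, _ = solve(carbon_cap, oxygen_cap)
        print(carbon_cap, oxygen_cap, round(growth, 3), phase(carbon_cap, oxygen_cap))
```

Points that share the same pattern of zero and nonzero duals fall in the same phase; the grid deliberately avoids points on the demarcation line, where the duals are not unique.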

Robustness Analysis 

Robustness, defined here with respect to metabolic 
networks, is a measure of the change in the maximal flux 
of the objective function (the growth flux was defined as 
the objective) when the optimal flux through any par- 
ticular metabolic reaction is changed. The robustness 
characteristics of the metabolic network were determined 
by calculating the optimal flux vector so as to maximize 
the growth flux (with only external flux constraints); this
flux was called the in silico wildtype flux. Then the flux
through the reaction in question was reduced from 100% 
to 0% of the in silico wildtype flux and the objective 
function was calculated. Additionally, the flux was
increased above the in silico wildtype value; the
upper bound on the increase was either the maximal
allowable flux in the reaction or the flux level at which
the objective function was reduced to zero. The calculations
were for a simulated aerobic batch culture in
glucose minimal media.
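
The reduction half of the robustness procedure can be sketched as a loop over linear programs. The example assumes a hypothetical four-reaction network in which an efficient route has a less efficient bypass; the wildtype flux is found first, and the maximum allowable flux through the efficient route is then lowered from 100% to 0% of that value. Imposing the restriction as an upper bound, and all names used here, are illustrative choices.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustration only).
# Columns: uptake (-> A), efficient route (A -> B),
#          bypass (A -> 0.5 B), growth (B -> biomass).
S = np.array([
    [1.0, -1.0, -1.0, 0.0],   # metabolite A
    [0.0, 1.0, 0.5, -1.0],    # metabolite B
])
c = np.array([0.0, 0.0, 0.0, 1.0])  # objective: the growth flux

def max_growth(route_cap=None):
    """Maximize growth with an optional cap on the efficient route."""
    bounds = [(0.0, 10.0), (0.0, route_cap), (0.0, None), (0.0, None)]
    res = linprog(-c, A_eq=S, b_eq=np.zeros(2), bounds=bounds, method="highs")
    return -res.fun, res.x

# In silico wildtype: only the external (uptake) constraint is active.
wt_growth, wt_flux = max_growth()
wt_route = wt_flux[1]

# Lower the allowable flux from 100% to 0% of the wildtype flux.
curve = []
for percent in range(100, -1, -25):
    growth, _ = max_growth(route_cap=wt_route * percent / 100.0)
    curve.append((percent, growth / wt_growth))

for percent, relative_growth in curve:
    print(f"{percent:3d}% of wildtype flux -> {relative_growth:.3f} of wildtype growth")
```

In this toy network growth never reaches zero because the bypass can carry the flux at half the yield; an essential reaction, as studied in the text, would drive the curve to zero.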

The FBA framework was used to address the systemic 
effect on the metabolic network of increased and de- 
creased (with respect to the in silico wildtype) metabolic 
flux. Herein, we quantified the robustness of the meta- 
bolic network to flux changes in the essential enzymatic 
reactions. The essential enzymes (for growth on glucose 
minimal media) were previously identified through an 
in silico analysis (24). Seven enzymatic reactions in 
central metabolism (Figure 1) were found to be es- 
sential: the transketolase (TKT), ribose-5-phosphate 
isomerase (RPI), two enzymes (GAP, PGK) in the 3-car- 
bon stage of glycolysis (3CG), and the first three enzymes 
(GLT, ACN, ICD) of the TCA cycle. Below, the robustness 
characteristics of the metabolic network with respect to 
alterations of the flux levels of these essential metabolic 
reactions will be investigated. We will utilize phenotype 
phase planes (PhPPs) to define points where the optimal 
utilization of the metabolic network changes due to 
capacity constraints on the essential enzymatic reactions. 

Transketolase. The transketolase (TKT) catalyzes an 
essential enzymatic reaction in the pentose phosphate 
pathway (PPP) (41). However, tkt mutant strains have 



been shown to grow on glucose minimal media with low 
TKT residual activity (3% of wildtype) (42, 43). The 
ability of the metabolic network to support growth with 
a large reduction in TKT flux was investigated in silico 
by continuously restricting the metabolic flux in the TKT 
reactions. As the maximum allowable flux through the 
TKT reactions was reduced from the in silico wildtype, 
it was determined that the ability of the metabolic 
network to support growth was virtually unchanged for 
enzymatic fluxes as low as 15% of the in silico wildtype 
(Figure 2). The response to decreased TKT metabolic flux 
was found to have two qualitatively different regions. The 
regions were identified in the PhPP (Figure 3). 

The PhPP describing the changes in the metabolic 
pathway utilization as a function of the TKT flux and 
the glucose uptake rate was calculated (Figure 3). The 
optimal relation between the glucose uptake rate and the 
TKT flux was determined from the PhPP (Figure 3). It 
was determined that there were two qualitatively differ- 
ent regions of metabolic pathway utilization for TKT 
fluxes lower than optimal, and these regions were defined 
as A and B (as shown in Figure 3). Furthermore, there 
were determined to be six qualitatively different regions 
for TKT fluxes greater than optimal, and these regions 
were defined as 1-6 (as shown in Figure 3). The maximal
growth flux (normalized to the in silico wildtype) was
calculated for all TKT fluxes from zero to the maximum
allowable flux that still permits cellular growth, with the
glucose uptake exchange flux constrained to
10 mmol · g-DW$^{-1}$ · h$^{-1}$; the results are shown in Figure 2.

In region A (above 15% of the in silico wildtype enzyme 
flux), the optimal value of the growth flux was hardly 
changed, and at the demarcation between regions A and 
B, the growth flux was decreased to 99.2% of the in silico 
wildtype. However, to cope with the decreased TKT 
metabolic flux carrying capacity, shifts in the metabolic 
pathway utilization occurred (Figure 4). The redox po- 
tential (NADPH) requirement for biosynthetic demands 
was achieved by a flux redistribution that resulted in the 
utilization of the transhydrogenase that converted NADH 
(produced from an increased TCA cycle flux, Figure 4C) 
into NADPH. The flux diverted from the PPP (Figure 4B)
resulted in increased glycolytic fluxes, in particular the
pyruvate kinase and the phosphoglucoisomerase fluxes
(Figure 4A). In this region, the optimal growth flux was
not sensitive to changes in the TKT flux. However, the
optimal fluxes in several metabolic processes were sensitive
to the TKT flux in this region (transhydrogenase, PYK,
PGI, TCA cycle flux).

In the second region (region B) of reduced TKT flux
(enzyme flux less than 15% of the in silico wildtype), the
metabolic network was limited in the ability to produce
the essential biosynthetic precursor, erythrose 4-phosphate.
In this region, the optimal growth flux was
sensitive to the flux level in the TKT reaction. The
metabolic fluxes in this region are not shown in Figure
4 because alternate optimal solutions exist. Because cellular
growth is limited solely by the availability of a single
biosynthetic precursor, the excess glucose can be converted
to any of the metabolic byproducts with the same
value of the objective function. Furthermore, the excess
high-energy phosphate bonds can be eliminated in any
futile cycle; thus alternate optimal solutions exist.

The effect on the metabolic network due to TKT fluxes 
increased beyond the optimal flux for growth was also 
examined. An increase in metabolic flux may result from 
the overexpression of the respective gene, and the robust- 
ness analysis can be used to identify the constraints on 
flux changes due to the integrated metabolic network. 

Figure 1. The central metabolic pathway reactions. Reactions: aceA, isocitrate lyase; aceB, malate synthase; aceEF, pyruvate
dehydrogenase; ack, acetate kinase; acn, aconitase; adh, acetaldehyde dehydrogenase; eno, enolase; fba, fructose-1,6-bisphosphate
aldolase; fbp, fructose-1,6-bisphosphatase; frd, fumarate reductase; fum, fumarase; gap, glyceraldehyde-3-phosphate dehydrogenase;
glk, glucokinase; glt, citrate synthase; gnd, 6-phosphogluconate dehydrogenase; gpm, phosphoglycerate mutase; icd, isocitrate
dehydrogenase; ldh, lactate dehydrogenase; mae, malic enzyme; mdh, malate dehydrogenase; pck, phosphoenolpyruvate carboxykinase;
pfk, phosphofructokinase; pfl, pyruvate formate lyase; pgi, phosphoglucose isomerase; pgk, phosphoglycerate kinase; pgl,
6-phosphogluconolactonase; ppc, phosphoenolpyruvate carboxylase; pps, phosphoenolpyruvate synthase; pts, phosphotransferase system;
pyk, pyruvate kinase; rpe, ribulose phosphate 3-epimerase; rpi, ribose-5-phosphate isomerase; sdh, succinate dehydrogenase; sfc,
malic enzyme; sucAB, 2-ketoglutarate dehydrogenase; sucCD, succinyl-CoA synthetase; tal, transaldolase; tkt, transketolase; tpi,
triosephosphate isomerase; zwf, glucose 6-phosphate-1-dehydrogenase. Metabolites: 2PG, 2-phosphoglycerate; 3PG, 3-phosphoglycerate;
6PG, D-6-phosphate-gluconate; 6PGA, D-6-phosphate-glucono-δ-lactone; AC, acetate; AcCoA, acetyl-CoA; α-KG, α-ketoglutarate;
CIT, citrate; DHAP, dihydroxyacetone phosphate; DPG, 1,3-bis-phosphoglycerate; E4P, erythrose 4-phosphate; ETH, ethanol; F6P,
fructose 6-phosphate; FDP, fructose 1,6-diphosphate; FOR, formate; FUM, fumarate; G6P, glucose 6-phosphate; GA3P, glyceraldehyde
3-phosphate; ICIT, isocitrate; LAC, lactate; MAL, malate; PEP, phosphoenolpyruvate; PYR, pyruvate; R5P, ribose 5-phosphate; Ru5P,
ribulose 5-phosphate; S7P, sedoheptulose 7-phosphate; SUCC, succinate; SuccCoA, succinyl CoA; X5P, xylulose 5-phosphate.

The optimal growth flux as a function of the flux in the
TKT reaction was calculated (Figure 2, insert). The
metabolic flux was continuously increased in silico from
the in silico wildtype value to the maximum flux that
permits growth. Six qualitatively different patterns of
metabolic pathway utilization (numbered 1-6 in Figure
3) were observed when the TKT flux was increased
beyond the in silico wildtype.

In region 1 of Figure 3, the optimal metabolic flux
vector was characterized by an increased PPP flux, an
active transhydrogenase reaction, a decreased PYK flux,
and a decreased TCA cycle flux. In region 2 the TCA cycle
flux was further decreased while the PPP flux was
increased, and optimally, the glyoxylate bypass was
utilized to replenish the TCA cycle biosynthetic precursors,
thus reducing the PPC flux (Figure 4). At the
demarcation between regions 2 and 3, the TCA cycle was
shut off and functioned to produce the biosynthetic
precursors, rather than redox potential. Optimally, in
region 3, the glucokinase reaction was operative in
glucose utilization, and this allowed for a more efficient
flow of the metabolites into the PPP due to the increased
TKT flux. Regions 4 and 5 were similar with respect to
the metabolic pathway utilization: the glyoxylate bypass
was no longer utilized in the optimal solution and the
PFL reaction optimally carried a small flux. Finally, in
region 6, redox potential was overproduced. This region
was characterized by alternate optimal solutions to
eliminate the excess high-energy phosphate bonds. However,
there were no metabolic byproducts produced (other
than CO2); this was because many metabolites were still
desirable to the cell (as identified through a shadow price
analysis (40)). In this region, the optimal oxygen uptake
rate was very high (~60 mmol · g-DW$^{-1}$ · h$^{-1}$) and it is likely
that the maximal TKT flux is much lower due to other
constraints on the metabolic network that were not
included in the analysis (such as oxygen mass transfer
limitations).

Figure 3. The glucose uptake rate (mmol · g-DW$^{-1}$ · h$^{-1}$)-transketolase flux (substrates converted · g-DW$^{-1}$ · h$^{-1}$) phenotype phase
plane. Exchange flux constraints were defined as discussed in the text. The regions are numbered and lettered. The numbered
regions correspond to TKT fluxes that are increased relative to the optimal value. The optimal relation is the thick demarcation line.
The lettered regions identify TKT flux reductions below the optimal relation. The metabolic fluxes along the thick vertical line
(glucose uptake rate = 10 mmol · g-DW$^{-1}$ · h$^{-1}$) are shown in Figure 4.

Ribose-5-Phosphate Isomerase. The ribose-5-phos- 
phate isomerase reaction (RPI) also catalyzes an essential 
reaction in the PPP (41) for growth in glucose minimal 



media. Similarly to tkt mutants, rpi mutants have been 
shown to grow with enzymatic activity much less than 
that of the wildtype. For example, Skinner and Cooper 
have isolated a strain with RPI activity below 10% of the 
wildtype, and this strain was able to grow (44). As the 







Figure 4. Optimal intracellular fluxes in the central metabolic pathways (substrates converted · g-DW$^{-1}$ · h$^{-1}$) as a function of the
TKT metabolic flux constraint (substrates converted · g-DW$^{-1}$ · h$^{-1}$). The glucose uptake rate was constrained to 10 mmol · g-DW$^{-1}$
· h$^{-1}$. (A) Glycolytic fluxes. (B) Pentose phosphate pathway fluxes. (C) TCA cycle fluxes.



maximum allowable RPI flux was reduced from the in 
silico wildtype, it was determined that the ability of the 
metabolic network to support growth was virtually 
unchanged for enzymatic fluxes as low as 28% of the in 
silico wildtype (Figure 2). The RPI-glucose uptake rate
PhPP was calculated to characterize the effect of altered 
RPI metabolic fluxes (not shown, similar to Figure 3). 
The qualitative effect of reduced flux in the RPI reaction 
from the in silico wildtype was investigated with FBA, 
and the holistic metabolic response to decreased and 
increased RPI fluxes was similar to the TKT results 
because the effect on the PPP was similar. 

3-Carbon Glycolysis. The 3-carbon glycolytic reactions
(GAP, PGK) that the in silico analysis predicted to be essential
have been shown experimentally to be required for the
growth of E. coli on glucose minimal media
(41). The essential glycolytic reactions were subjected to
a robustness analysis to investigate the optimal systemic 
effect of flux alteration. The ability of the metabolic 
network to support growth with a reduction in 3CG flux 
was investigated in silico by continuously restricting the 
3CG flux. As the allowable flux through the 3CG reac- 



tions was reduced from the in silico wildtype, it was 
determined that the sensitivity of the growth flux was 
increased compared to the other essential reactions. 
When the 3CG flux was reduced below about 70% of the 
in silico wildtype, the growth flux was sensitive to the 
3CG flux (Figure 2). Furthermore, the 3CG fluxes could 
only be increased to 110% of the in silico wildtype before
severe limitations in the growth flux were encountered 
(Figure 2). We have investigated the metabolic response 
to 3CG flux level alterations by a phenotype phase plane 
analysis (Figure 5). 

The PhPP describing the changes in the metabolic 
pathway utilization as a function of the 3CG flux and 
the glucose uptake rate was calculated (Figure 5). The 
optimal relation between the glucose uptake rate and the 
3CG flux was determined from the PhPP (Figure 5). It 
was determined that there were six qualitatively differ- 
ent regions of metabolic pathway utilization for 3CG 
fluxes lower than optimal, and these regions were defined 
as A-F (as shown in Figure 5). Furthermore, there were
determined to be two qualitatively different regions for
3CG fluxes greater than optimal, and these regions were
defined as 1-2 (as shown in Figure 5). The maximal
growth flux (normalized to the in silico wildtype) was
calculated for all 3CG fluxes from zero to the maximum
allowable flux that still permits cellular growth, with the
glucose uptake exchange flux constrained to
10 mmol · g-DW$^{-1}$ · h$^{-1}$; the results are shown in Figure 2.

Figure 5. The glucose uptake rate (mmol · g-DW$^{-1}$ · h$^{-1}$)-3-carbon glycolytic flux (substrates converted · g-DW$^{-1}$ · h$^{-1}$) phenotype
phase plane. Exchange flux constraints were defined as discussed in the text. The regions are numbered and lettered. The numbered
regions correspond to 3CG fluxes that are increased relative to the optimal value. The optimal relation is the thick demarcation line.
The lettered regions identify 3CG flux reductions below the optimal relation. The metabolic fluxes along the thick line (glucose
uptake rate = 10 mmol · g-DW$^{-1}$ · h$^{-1}$) are shown in Figure 6.

The optimal relation between the glucose uptake and 
the 3CG flux was calculated (Figure 5). The sensitivity 
of other optimal fluxes in the metabolic network upon 
the reduction of the 3CG flux was examined (Figure 6). 
With 3CG flux reduction just below the optimal value, 
the optimal metabolic network operation was character- 
ized by region A (Figure 5). In region A, the 3CG flux 
reduction led to increased PPP fluxes and the transhy- 
drogenase was used (Figure 6B); additionally, the TCA 
cycle flux was reduced (Figure 6C). The reduced 3CG flux 
also led to the reduction of the PYK flux, which was 
optimally completely inactivated at the demarcation 
between regions A and B (Figure 6A). In region B, the 
glyoxylate bypass was optimally utilized and the TCA 
cycle fluxes were further reduced. At the demarcation 
between region B and C, the TCA cycle no longer 
operated cyclically but rather served to generate the 
biosynthetic precursors. In region C, glucokinase was
included in the optimal flux vector. The inclusion of
glucokinase decoupled the phosphoenolpyruvate to pyruvate
biochemical conversion from the uptake of glucose,
thus allowing the 3CG flux to be decreased with
little effect on the maximal growth flux. The growth flux
at the demarcation between region C and D was 98% of 
the in silico wildtype (Figure 2). 

Regions D and E were very similar with respect to the 
optimal metabolic flux vector. In these regions, the
pyruvate-formate lyase was optimally active and the
glyoxylate bypass was no longer included in the optimal flux
vector. Additionally, in regions D and E, the growth flux 
was more sensitive (compared to regions A-C) to the 3CG 
flux, and at the demarcation between regions E and F 
the maximal growth flux was 95% of the in silico wild- 
type. 

In the final region of reduced 3CG flux (region F, metabolic
flux less than 63% of the in silico wildtype), the



metabolic network was limited in the ability to produce 
the essential biosynthetic precursors below the block in 
the metabolic network. In this region, the optimal growth 
flux was sensitive to the 3CG flux, and the maximal 
growth flux linearly decreased to zero as the 3CG flux 
was reduced to zero from the region E,F boundary. The 
metabolic fluxes in this region are not shown in Figure 
6 because alternate optimal solutions exist. Cellular 
growth is limited by the availability of the biosynthetic 
precursors after the metabolic blockage, and the diversion 
of the flux from glycolysis to the PPP resulted in excess 
high-energy phosphate bonds and redox potential. The 
growth flux in region F is dependent upon increased oxy- 
gen availability to eliminate the excess redox potential. 
An additional constraint was imposed on the metabolic
network, i.e., the oxygen uptake was constrained below
a physiologically realistic value of 20 mmol · g-DW$^{-1}$ · h$^{-1}$,
and the feasible set did not contain a growth flux for 3CG
fluxes below about 40% of the in silico wildtype (not
shown). Thus, the partial inhibition of the 3CG fluxes
can theoretically prevent the growth of E. coli; however,
growth can be maintained with reduced glucose uptake
rates.

The holistic effect of increased 3CG flux on the metabolic
network's capability to support cellular growth was
assessed with FBA. The optimal growth flux as a function
of the 3CG flux was calculated (Figure 2). The
metabolic flux was continuously increased in silico from 
the in silico wildtype value to the maximum flux that 
permits growth, and two qualitatively different metabolic 
flux vectors (numbered 1 and 2 in Figure 5) were 
observed. 

In region 1 (Figure 5), the optimal metabolic flux vector
was characterized by a decreased pentose phosphate
pathway (PPP) flux, an active transhydrogenase reaction, an
increased PYK flux, and an increased TCA cycle flux.
However, region 1 only extends to a 3CG flux of 110% of
the in silico wildtype (with a glucose uptake of 10 mmol ·
g-DW$^{-1}$ · h$^{-1}$), and region 2 of Figure 5 was characterized
by alternate optimal flux distributions. The metabolic
network was limited in the ability to produce the essential
biosynthetic precursors before the affected reactions
of 3-carbon glycolysis. In this region, the optimal
growth flux was sensitive to the 3CG flux. The metabolic
fluxes in this region are not shown in Figure 6 because
alternate optimal solutions exist. The excess pyruvate
produced by the elevated 3CG flux can be converted to
any of the metabolic byproducts with the same value of
the objective function. Furthermore, the excess high-energy
phosphate bonds can be eliminated in any futile
cycle, and thus alternate optimal solutions exist.

Figure 6. Optimal intracellular fluxes (substrates converted · g-DW$^{-1}$ · h$^{-1}$) in the central metabolic pathways as a function of the
3CG metabolic flux constraint (substrates converted · g-DW$^{-1}$ · h$^{-1}$). The glucose uptake rate was constrained to 10 mmol · g-DW$^{-1}$
· h$^{-1}$. (A) Glycolytic fluxes. (B) Pentose phosphate pathway fluxes. (C) TCA cycle fluxes.

TCA Cycle. The initial three fluxes of the TCA cycle 
were determined to be essential. The deletion of any of 
these enzymatic activities resulted in a glutamate re- 
quirement. This requirement has been shown experi- 
mentally (45). The in silico robustness analysis was 
performed to assess the effect of decreased (and in- 
creased) flux entering the TCA cycle (Figures 2, 7, and 
8). The ability of the metabolic network to support growth 
with a reduction in the TCA cycle flux was investigated 
in silico by continuously restricting the citrate synthase 
flux, which we will refer to as the TCA cycle flux. As the 
TCA cycle flux constraint was reduced from the in silico 
wildtype, it was determined that the ability of the 



metabolic network to support growth was not sensitive 
to the TCA cycle flux above 18% of the in silico wildtype. 
Furthermore, the TCA cycle flux could be increased to 
about 160% of the in silico wildtype before severe 
limitations in the growth flux were encountered (Figure 
2). We have investigated the metabolic response to TCA 
cycle flux level alterations by a phenotype phase plane 
analysis (Figure 7). 

The PhPP describing the changes in the metabolic 
pathway utilization as a function of the TCA cycle flux 
and the glucose uptake rate was calculated (Figure 7). 
The optimal relation between the glucose uptake rate and 
the TCA cycle flux was determined from the PhPP 
(Figure 7). It was determined that there were four 
qualitatively different regions of optimal metabolic path- 
way utilization for TCA cycle fluxes lower than optimal, 
and these regions were defined as A-D (as shown in 
Figure 7). Furthermore, there were determined to be four 
qualitatively different regions for TCA cycle fluxes greater 
than optimal, and these regions were defined as 1-4 (as 
shown in Figure 7). The maximal growth flux (normalized 
to the in silico wildtype) was calculated for all TCA cycle 
fluxes from zero to the maximum allowable flux that still 
permits cellular growth (the glucose uptake exchange flux 
was constrained to 10 mmol g-DW⁻¹ h⁻¹), and the 
results are shown in Figure 2. 

Figure 7. The glucose uptake rate (mmol • g-DW⁻¹ • h⁻¹)-TCA cycle flux (substrates converted • g-DW⁻¹ • h⁻¹) phenotype phase 
plane. Exchange flux constraints were defined as discussed in the text. The regions are numbered and lettered. The numbered 
regions correspond to TCA cycle fluxes that are increased relative to the optimal value. The optimal relation is the thick demarcation 
line. The lettered regions identify TCA cycle flux reductions below the optimal relation. The metabolic fluxes along the thick line 
(glucose uptake rate = 10 mmol • g-DW⁻¹ • h⁻¹) are shown in Figure 8. 
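
The phase plane calculation can be illustrated on a toy problem. The sketch below is a hypothetical three-constraint model, not the E. coli network: an internal flux x yields energy and a growth precursor, an alternate route y yields energy only, and every stoichiometric coefficient is invented for illustration. For each point in the (uptake rate, x) plane it reports which constraint limits the optimal growth flux; grid points that share a limiting constraint play the role of the PhPP regions.

```python
from fractions import Fraction as F

# Hypothetical toy model (all coefficients are assumptions):
#   x: substrate -> 4 energy + precursor   (internal flux on the vertical axis)
#   y: substrate -> 3 energy               (alternate energy route)
#   growth g consumes 1 substrate, 10 energy and 0.2 units of x
# Constraints: g <= 5x;  10g <= 4x + 3y;  x + y + g <= uptake;  y >= 0.
# Eliminating y leaves three upper limits on g for a fixed (uptake, x) point.

def limiting_constraint(uptake, x):
    """Return (maximal growth, name of the constraint that limits it)."""
    uptake, x = F(uptake), F(x)
    limits = {
        "precursor-limited": 5 * x,
        "energy-limited": (3 * uptake + x) / 13,
        "excess-flux-limited": uptake - x,
    }
    name = min(limits, key=limits.get)
    growth = limits[name]
    if growth <= 0:
        return F(0), "no growth"
    return growth, name

def phase_plane(uptakes, fluxes):
    """Map every (uptake, x) grid point to the name of its limiting constraint."""
    return {(u, x): limiting_constraint(u, x)[1] for u in uptakes for x in fluxes}

plane = phase_plane(uptakes=[5, 10], fluxes=[0, F(1, 5), 4, 9])
for point, region in sorted(plane.items()):
    print(point, region)
```

In a genome-scale model the same classification would come from the shadow prices of the linear program rather than from a closed-form elimination; the closed form is used here only to keep the example self-contained.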

The optimal relation between the glucose uptake and 
the TCA cycle flux was calculated, and the sensitivity of 
the optimal fluxes in central metabolism to the TCA cycle 
flux was examined (Figure 8). Region A defines the 
optimal set of metabolic reactions that are utilized with 
a reduction of the TCA cycle flux below the optimal value 
(Figure 7). In region A, the reduction of the TCA cycle 
flux led to increased PPP fluxes and the transhydroge- 
nase was used; additionally, the glycolytic flux was 
decreased. The reduced TCA cycle flux led to the reduc- 
tion of the PYK flux, which was optimally completely 
inactivated at the demarcation between regions A and 
B. In region B, the glyoxylate bypass was optimally 
utilized and the glycolytic fluxes were further reduced. 
At the demarcation between region B and C, the TCA 
cycle fluxes were reduced to the point that the TCA cycle 
no longer operated cyclically but rather served to gener- 
ate the biosynthetic precursors. In region C, glucokinase 
was included in the optimal flux vector, and the inclusion 
of the glucokinase decoupled the phosphoenolpyruvate 
to pyruvate biochemical conversion and the uptake of 
glucose. The growth flux at the demarcation between 
region C and D was 98% of the in silico wildtype (Figure 
2) and the TCA cycle flux was 18% of the in silico 
wildtype TCA cycle flux. 

In the final region of reduced TCA cycle fluxes (meta- 
bolic flux less than 18% of the in silico wildtype), the 
metabolic network was limited in the ability to produce 
α-ketoglutarate, an essential biosynthetic precursor. In 
this region, the optimal growth flux was sensitive to the 
TCA cycle flux, and the characteristic behavior was 
similar to region F of the 3CG-glucose uptake rate PhPP 
(Figure 5) that was discussed above. 

The holistic effect of increased TCA cycle flux on the 
metabolic network's capability to support cellular growth 
was assessed with FBA. The optimal growth flux as a 
function of the TCA cycle flux was calculated 
(Figure 2). The metabolic flux was continuously increased 



in silico from the in silico wildtype value to the maximum 
flux that permits growth, and four qualitatively different 
metabolic flux vectors (numbered 1-4 in Figure 7) were 
observed. 

In region 1 of Figure 7, the optimal metabolic flux 
vector was characterized by an active transhydrogenase 
reaction, an increased PYK flux, an increased glycolytic 
flux, and a decreased PPP (which was optimally inacti- 
vated at the demarcation between regions 1 and 2). 
Regions 2 and 3 are very similar with respect to the set 
of metabolic reactions that are optimally utilized, and in 
these regions, the PPP is optimally inactivated. At the 
demarcation between regions 3 and 4, the maximal 
growth flux was about 95% of the in silico wildtype and 
the TCA cycle flux was increased to approximately 160% 
of the in silico wildtype. With TCA cycle flux increases 
beyond region 3, region 4 is encountered (Figure 7). 
Region 4 was characterized by alternate optimal flux 
distributions. The metabolic network was limited in the 
ability to produce the essential glycolytic and PPP 
biosynthetic precursors. In this region, the optimal 
growth flux was sensitive to the TCA cycle flux, and the 
metabolic fluxes in this region are not shown in Figure 
8 because alternate optimal solutions exist. 

Discussion 

We have illustrated, with the complete E. coli meta- 
bolic network, how optimal metabolic phenotypes (flux 
vectors) and shifts in metabolic behavior can be analyzed 
and interpreted in silico. From the fundamental physi- 
cochemical constraints on the metabolic network, the 
feasible set that identifies the capabilities of the meta- 
bolic network was identified. Subsequently, a linear 
optimization routine was utilized to search the feasible 
set for a flux vector that maximizes a given objective func- 
tion. Given the complexity associated with developing 
complete dynamic modeling of cellular processes, the 
constraining approach, as discussed herein, is a particu- 
larly useful alternative approach to metabolic systems 
analysis. The results presented herein are of fundamental 
interest for several reasons. First, the ability to define 
essential genes under various conditions will have many 



Biotechnol. Prog., 2000, Vol. 16, No. 6 

Figure 8. Optimal intracellular fluxes (substrates converted • g-DW⁻¹ • h⁻¹) in the central metabolic pathways as a function of the 
TCA cycle metabolic flux constraint (substrates converted • g-DW⁻¹ • h⁻¹). The glucose uptake rate was constrained to 10 mmol 
g-DW⁻¹ h⁻¹. (A) Glycolytic fluxes. (B) Pentose phosphate pathway fluxes. (C) TCA cycle fluxes. 



practical applications. Second, the results presented 
address the sensitivity of the objective function to 
specific fluxes in the metabolic network. Finally, the 
results demonstrate the potential capabilities of in silico 
analysis of cellular systems. Understanding the relation 
between individual fluxes and the holistic function of the 
metabolic network is essential to the successful metabolic 
engineering of a living system, and FBA provides a 
methodology that can be used to direct the metabolic engineer. 
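
The two steps described above (define the feasible set from mass-balance and capacity constraints, then search it with linear optimization) can be sketched on a toy network. Everything in the example is a hypothetical illustration: the four reactions, their stoichiometry, and the bounds are invented, and the small vertex-enumeration routine stands in for a real linear programming solver. None of it is the E. coli model analyzed here.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical toy network (not the E. coli model):
#   uptake: -> A           ferm:   A -> E
#   resp:   A -> 4 E       growth: A + 6 E -> biomass
# Steady state on A and E leaves two free fluxes, ferm (f) and resp (r):
#   growth = (f + 4 r) / 6,   uptake = f + r + growth = (7 f + 10 r) / 6

def maximize_2d(c, rows):
    """Maximize c.x over {x : a.x <= b for (a, b) in rows} by vertex enumeration."""
    best = None
    for (a1, b1), (a2, b2) in combinations(rows, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if det == 0:
            continue  # parallel constraints define no vertex
        # Cramer's rule for the intersection of the two constraint lines
        x = ((b1 * a2[1] - a1[1] * b2) / det, (a1[0] * b2 - b1 * a2[0]) / det)
        if all(a[0] * x[0] + a[1] * x[1] <= b for a, b in rows):
            value = c[0] * x[0] + c[1] * x[1]
            if best is None or value > best[0]:
                best = (value, x)
    return best

def toy_fba(max_uptake=10, max_resp=3):
    """Maximize growth over the feasible set of the toy network."""
    rows = [
        ((F(7, 6), F(10, 6)), F(max_uptake)),  # uptake capacity
        ((F(0), F(1)), F(max_resp)),           # respiration capacity
        ((F(-1), F(0)), F(0)),                 # ferm >= 0
        ((F(0), F(-1)), F(0)),                 # resp >= 0
    ]
    growth, (f, r) = maximize_2d((F(1, 6), F(4, 6)), rows)
    return {"growth": growth, "ferm": f, "resp": r}

print(toy_fba())
```

Vertex enumeration is only practical for a handful of constraints; it is used here because an optimum of a bounded linear program always lies at a vertex of the feasible set, which makes the idea easy to see.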

First, we should address the assumptions associated 
with the utilization of linear optimization to identify the 
optimal flux vector. The metabolic constraining formal- 
ism that we have discussed is based on the fundamental 
physicochemical constraints that all cells must abide by. 
Within the set of constraints, the cell will choose a flux 
vector at which to operate. We have attempted to find 
the same flux vector by employing linear optimization, 
and the assumption is that the cell has evolved the regu- 
latory mechanisms to find the optimal solution within 
the physicochemical constraints to maximize its survival. 
We have mathematically represented survival as cellular 
growth. On the basis of comparisons of the linear 



optimization results and experimental data, the assump- 
tion appears to be valid under the tested conditions (27). 
However, currently, the number of situations for which 
the validity of the assumption has been addressed is 
limited. Therefore, further experimental validation is in 
order. It should be noted that, even if the assumption of 
optimal growth proves correct for wildtype strains, it is 
not clear if we should expect that an engineered strain 
will behave in an optimal manner. Therefore, the opti- 
mization results may only provide an upper bound on the 
expected behavior of engineered strains. 

The identification of the essential gene products in a 
metabolic network is of fundamental interest (46, 47). 
The metabolic constraining formalism that was described 
here provides an efficient method to study the conse- 
quences of alterations in the genotype and to gain insight 
into the genotype-phenotype relation. The study of the 
removal of individual metabolic enzymes in the central 
metabolic pathways demonstrated fundamental redun- 
dancy properties of the E. coli metabolic genotype and 
the existence of relatively few critical gene products. 
Seven metabolic reactions were determined to be essential 
in the metabolic network. Of these seven reactions, 
one set of three forms a linear reaction series and another 
set of two forms another linear reaction series; therefore, 
the effect of deleting any of the reactions in the linear 
set is equivalent. Thus, there were basically four different 
metabolic fluxes that were essential: two in the PPP, the 
3-carbon stage of glycolysis, and the first three reactions 
of the TCA cycle. It was shown that there are regions 
where the ability of the metabolic network to support 
growth was not affected by the altered flux levels in the 
essential enzymatic reactions. This metabolic robustness 
was due to the capability of the metabolic network to shift 
the production of redox potential and high-energy phos- 
phate bonds. However, the point where the metabolic 
network was unable to support growth was quantitatively 
derived in silico. At this point, the metabolic network was 
limited in the ability to produce biosynthetic precursors. 
In the biosynthetic precursor limited region, the avail- 
ability of redox potential and high-energy phosphate 
bonds did not limit the growth capability of the cell. 

The robustness analysis presented herein is essentially 
a sensitivity analysis. The sensitivity of the objective 
function to changes in the optimal flux vector was ad- 
dressed. A constraint was added to the metabolic net- 
work, thus altering a single metabolic flux, and a new 
optimal flux vector was calculated. The relation between 
the additional constraint and the objective function was 
examined to investigate the robustness in the system 
with respect to the essential enzymatic reactions. Fur- 
thermore, the relation between the additional constraint 
and optimal flux vector was also examined. The results 
of the sensitivity analysis can be used to interpret the 
optimal metabolic fluxes and their relation to in vivo 
metabolic fluxes. For example, metabolic fluxes for which 
the objective function is highly sensitive are likely to 
obtain in vivo values that are consistent with the optimal 
values. Furthermore, metabolic fluxes to which the 
objective is not sensitive may be able to take on flux 
values in a large range with very little effect on the 
optimal solution. Additionally, other fluxes in the meta- 
bolic network that are sensitive to a given metabolic flux 
may provide an indication of the accuracy of the optimal 
predictions. For example, the TCA cycle flux was reduced 
to approximately 18% of the in silico wildtype with little 
effect on the objective function; however, the PPP fluxes 
were three times the in silico wildtype, much 
higher than experimental data indicate. 
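
A robustness scan of this kind can be sketched as follows. The model is a hypothetical toy (invented stoichiometry, not the E. coli network): the constrained flux x supplies both energy and a growth precursor, while an alternate route y supplies energy only. Fixing x at a value t and eliminating y gives the re-optimized growth flux in closed form, so the objective can be tabulated against the added constraint.

```python
from fractions import Fraction as F

# Hypothetical toy model (all coefficients are assumptions):
#   x: substrate -> 4 energy + precursor   (the flux being constrained)
#   y: substrate -> 3 energy               (alternate energy route)
#   growth needs 1 substrate, 10 energy and 0.2 units of x-derived precursor
# With x fixed at t and substrate uptake capped at `uptake`:
#   growth <= 5 t                      (precursor supply)
#   13 * growth <= 3 * uptake + t      (energy, after eliminating y)
#   growth <= uptake - t               (so that y >= 0)

def max_growth(t, uptake=10):
    """Re-optimized growth flux when x is constrained to the value t."""
    t = F(t)
    return max(F(0), min(5 * t, (3 * uptake + t) / 13, uptake - t))

def robustness_curve(fractions, uptake=10):
    """Relative growth for x set to the given fractions of its optimal value."""
    t_wt = F(5 * uptake, 7)            # x at the unconstrained optimum
    g_wt = max_growth(t_wt, uptake)
    return [(f, float(max_growth(F(f) * t_wt, uptake) / g_wt)) for f in fractions]

scan_points = [F(0), F(1, 20), F(1, 10), F(1, 2), F(1), F(6, 5), F(7, 5)]
for frac, rel_growth in robustness_curve(scan_points):
    print(f"x at {float(frac):4.0%} of optimum -> relative growth {rel_growth:.2f}")
```

In this toy the growth flux changes slowly over a wide range of x because the alternate route compensates for energy production, and it falls steeply only once the precursor supply becomes limiting, which is the qualitative pattern discussed above.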

Understanding the metabolic fluxes and their control 
is essential to the ability to "design" metabolic networks 
for the production of commodity chemicals (e.g., antibiotics, 
vitamins, and amino acids). Using flux balance analysis, 
the complete range of metabolic phenotypes can be exam- 
ined under defined environmental and genetic conditions 
through the use of phenotypic phase planes (PhPP). For 
the E. coli network PhPPs were generated for growth on 
glucose minimal media spanning the uptake rate of the 
carbon source and an intracellular metabolic flux. The 
PhPP formalism thus provided an efficient methodology 
for examining the consequences of altered fluxes 
within the cell, and bioinformatically based models 
will undoubtedly have a major impact on the development 
of metabolic engineering (4, 5). Herein, we have 
investigated the effect of altered metabolic flux levels on 
the maximal growth flux, thus quantifying the relation 
between altered flux levels and optimal cellular growth. 
Furthermore, by examining the entire set of constraints 
on the metabolic network, constraints on metabolic flux 
alterations can be identified. 



FBA incorporates no information on enzyme kinetics 
or gene regulation, thus limiting insight into dynamic 
responses. From flux balance analysis it is possible to 
realize some of the fundamental constraints that meta- 
bolic systems are faced with and define the feasible set 
that contains all admissible steady-state flux vectors. As 
in vivo reaction dynamics are better understood, the ability 
to predict dynamic responses of metabolic networks 
to environmental and genetic perturbations using dynamic 
modeling approaches will become more feasible. 
In general, regulatory schemes and reaction dynamics will 
serve to further constrain metabolic behavior to operate 
in confined subspaces of the feasible set. Identifying these 
regions from both the theoretical and experimental side 
will be a challenge for the future. 

A number of experimental technologies have now made 
the holistic study of biological systems feasible. The abil- 
ity to assimilate DNA chip-based and protein expression 
technologies providing genome-scale information with 
computational methods for metabolic network analysis 
will become important in advancing the study of meta- 
bolic physiology and the practice of metabolic engineer- 
ing. Currently the interpretation of high-throughput ex- 
perimental information on systemic behavior is limited 
by a lack of analysis capabilities. Can systems-based 
quantitative in silico approaches such as flux balance be 
used to assist in understanding this flood of data? This 
question will need to be answered as interest builds in 
the genomics community for quantitative systems analy- 
sis. 

The analysis of the metabolic phenotype-genotype 
relation using the bioinformatically based in silico meta- 
bolic genotype of E. coli can serve as a basis for the 
construction of parallel in silico representations of other 
single-cell organisms. Thus, the results presented are 
particularly relevant with the current emphasis on 
genome sequencing. Utilizing the techniques described 
herein, information can be gained regarding the meta- 
bolic physiology of a cell with relatively little experimen- 
tal biochemical information on the cell of interest. 
However, this analysis should be considered a single step 
toward the integrative analysis of bioinformatic data- 
bases to predict and understand cellular function based 
on the underlying genetic content. Continued prediction 
and experimental verification will be an integral part of 
the further development of in silico strains and their use 
to represent their in vivo counterparts. 

The growing scope of applications of genome-scale 
metabolic reconstructions using Escherichia coli 

Adam M Feist 1 & Bernhard O Palsson 1,2 

The number and scope of methods developed to interrogate and use metabolic network reconstructions has 
significantly expanded over the past 15 years. In particular, Escherichia coli metabolic network reconstruction has 
reached the genome scale and been utilized to address a broad spectrum of basic and practical applications in five 
main categories: metabolic engineering, model-directed discovery, interpretations of phenotypic screens, analysis of 
network properties and studies of evolutionary processes. Spurred on by these accomplishments, the field is expected 
to move forward and further broaden the scope and content of network reconstructions, develop new and novel in 
silico analysis tools, and expand in adaptation to uses of proximal and distal causation in biology. Taken together, 
these efforts will solidify a mechanistic genotype-phenotype relationship for microbial metabolism. 



The availability of reconstructed metabolic networks for microorganisms 
has increased rapidly in recent years, and a growing number of research 
groups are reconstructing metabolic networks for organisms of inter- 
est 1 . A network reconstruction represents a highly curated set of primary 
biological information for a particular organism and thus can be con- 
sidered a biochemically, genetically and genomically structured (BiGG) 
database 1,2 . A curated BiGG database (de facto a knowledge base) can be 
converted into a mathematical format (that is, an in silico model), and 
used to computationally assess phenotypic properties using a variety of 
computational methods 2,3 . Genome-scale reconstructions are thus a key 
step in quantifying the genotype-phenotype relationship and can be used 
to 'bring genomes to life' 4 . 

The purpose of this review is to summarize and classify applications 
using the E. coli reconstruction to answer a broad spectrum of biological 
questions. By doing so, we provide both an up-to-date overview of the 
applications of constraint-based analysis and a guide to similar appli- 
cations for the growing number of organisms for which genome-scale 
reconstructions are becoming available. 

Model formulation and the E. coli metabolic reconstruction 

The four key steps in the formulation and use of genome-scale models are 
illustrated in Figure 1. Foundational to the process is the generation of 
global, or genome-scale, 'omics data'. Omics data, along with legacy 
information (that is, the 'bibliome') and small-scale detailed experiments, can 
be used to define the interactions among the biological components that 
are used to reconstruct organism-specific networks 1 . Network reconstruction 
is also an iterative, ongoing process that continually integrates data in 
a formal fashion as they become available 5 . As a result, a current and 
well-curated genome-scale network reconstruction is a common denominator 
for those studying systems biology of an organism; for an in-depth review 
of the bottom-up reconstruction process, see ref. 2. 

The arrow from step 2 to step 3 in Figure 1 involves a somewhat subtle, 
but critical, transition. With the definition of systems boundaries and 
other details, a network reconstruction can be converted into a math- 
ematical format that can be computationally interrogated and subse- 
quently used for experimental design 2 . Thus, a network reconstruction 
is converted into a genome-scale model (GEM) 3 . This arrow represents 
a bridge between the realms of high-throughput data/bioinformatics on 
the one hand and systems science on the other. A network reconstruction 
(or a BiGG knowledgebase) is, in principle, accessible to all and significant 
strides have been made to make computation with GEMs more readily 
accessible 6-11 . This availability of both genome-scale reconstruction and 
GEMs has unleashed creativity in research groups around the world and 
resulted in the series of studies reviewed below. 

The 18-year history of reconstruction of the E. coli metabolic net- 
work (summarized in Fig. 2) has culminated in a network containing 
a total number of 1,260 open reading frame (ORF) metabolic func- 
tions 12-19 . This reconstruction represents 48% of the experimentally 
determined ORF functions in the E. coli genome (Table 1). It should 
be noted that the functions of 92% of the 1,260 gene products have 
been experimentally verified. Reconstruction of the E. coli network 
has thus approached an exhaustion of known metabolic gene func- 
tions and it is now being used in a prospective fashion to discover 
new metabolic capabilities (see below). The reconstruction of the E. 
coli metabolic network represents the best-developed genome-scale 
network to date and it has proven to be a platform for a variety of 
computational analyses. It should be noted that although there are 
different E. coli GEMs, there is only one unique network for E. coli and 
each successive iteration strives to best represent this content (we use 
'the E. coli GEM' to refer to any of these iterations). Three successive 
E. coli GEMs from our group 17-19 have been used as the basis for over 
60 detailed studies reviewed below. 

A growing number of research groups use the E. coli GEM for predicting, 
interpreting and understanding E. coli phenotypic states and function. 

[Figure 1 panel labels: Step 2. Knowledge base: one set of reactions encoded by a genome. Step 4. Validation, discovery. Reconstruction of biochemical reaction network.] 

Figure 1 Formulation and use of GEMs as a 
four-step process. Step 1, the process is based 
on a variety of high-throughput data sets (that is, 
omics data) and a comprehensive assessment of 
the literature (that is, bibliomic data). Step 2, 
all of the data types are used to reconstruct the 
list of biochemical transformations that make 
up a network, as well as their genetic basis 1 . In 
principle, the network is unique. Step 3, the data 
contained in the reconstruction can be formally 
represented (that is, in the form of matrices and 
logical statements) that can be mathematically 
characterized by a variety of methods. Step 4, the 
computational model enables a broad spectrum 
of applications, as reviewed in this article. Figure 
adapted from ref. 2. 



In addition, the reconstruction itself has been used as a context for the 
interpretation of large amounts of experimental data. Applications of 
the E. coli GEM range from pragmatic to theoretical studies, and can be 
classified into five general categories (Fig. 3): first, metabolic engineer- 
ing 20-30 ; second, biological discovery 31-37 ; third, assessment of phenotypic 
behavior 19,38-63 ; fourth, biological network analysis 64-79 ; and fifth, studies 
of bacterial evolution 80-82 . The in silico methods used to probe the E. coli 
GEM in each study are summarized in Figure 4. These methods perform 
an assessment of the solution spaces associated with the mathematical 
representation of a reconstruction 2 ; they are categorized as either unbiased 
or biased 3 . The latter category relies on an observer bias that is stated 
through an objective function (that is now beginning to be experimentally 
examined 83 ) that has been utilized in most of the studies reviewed here for 
the general application of flux balance analysis (FBA) 84-86 . Each category 



of application is now detailed, with emphasis on the first three that have 
the greatest practical utility. 

GEMs and metabolic engineering 

Through the application of computational methods that incorporate lin- 
ear, mixed integer linear and nonlinear programming, it has been demon- 
strated that model-directed strain design can lead to increased metabolite 
production 20-30 . In these studies, the E. coli GEM is principally used to 
analyze the metabolite production potential of E. coli and identify meta- 
bolic interventions needed to produce the metabolite of interest. Thus, 
E. coli strains have been systematically designed through in silico analysis 
to overproduce target metabolites, such as lycopene 23,24 , lactic acid (our 
group 25 ), ethanol 26 , succinic acid 27,28 , L-valine 29 , L-threonine 30 , additional 
amino acids 21 as well as diverse products from hydrogen to vanillin 22 . 
Select exemplary metabolic engineering applications 
are described in more detail below. 

Figure 2 The iterative reconstruction and history of the E. coli metabolic network. Six milestone efforts are shown that contributed to the reconstruction 
of the E. coli metabolic network. For each of the six reconstructions 12-19 , the number of included reactions (blue diamonds), genes (green triangles) and 
metabolites (purple squares) are displayed. Also listed are noteworthy expansions that each successive reconstruction provided over previous efforts. 
For example, Varma & Palsson 13,14 included amino acid and nucleotide biosynthesis pathways in addition to the content that Majewski & Domach 12 
characterized. The start of the genomic era 92 (1997) marked a significant increase in included reconstruction components for each successive iteration. The 
reaction, gene and metabolite values for pregenomic-era reconstructions were estimated from the content outlined in each publication and in some cases, 
encoding genes for reactions were unclear. 

To increase the production of an already high 
producing strain, a systematic computational 
search was developed 24 to explore the E. coli 
metabolic network and report gene deletions 
that diverted metabolic flux toward the desired 
product lycopene. This process resulted in a 
knockout strain that, when constructed, showed 
a twofold increase in the production of lycopene 
over the parental strain. In this analysis, the 
computational algorithm MOMA (minimization 
of metabolic adjustment) 41 and the iJE660 
(ref. 18) E. coli GEM were used to sequentially 
examine additive genetic deletions that would 
improve lycopene production while maintaining 
cell viability. Strain designs were constructed 
through genetic manipulations using the predicted 
modifications and this computational 
approach yielded a twofold increase in production 
rate over a previously engineered overproducing 
strain and an 8.5-fold increase over 
wild-type production harboring only a lycopene 
biosynthesis plasmid 24 . Strain performance was 
evaluated by monitoring lycopene production 
through enzymatic assays and mutant growth 
rates. In addition, when the strain designs identified 
computationally were compared with 
mixed combinatorial transposon mutagenesis, 
the maximum observed production could be 
designed solely using the systematic GEM-aided 
computational method 23,24 . Furthermore, a deleterious 
effect was observed when targets identified 
in individual computational designs were 
combined in an attempt to achieve an overall 
more desirable phenotype. Thus, the overall 
systematic effects from individual designs were 
not additive and needed to be interpreted in the context of the 
network. 

Table 1 Properties of the most current E. coli metabolic reconstruction 
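
The idea behind MOMA can be shown with a minimal sketch. MOMA selects, among the flux vectors still feasible after a deletion, the one closest (in Euclidean distance) to the wild-type flux vector, which in general is a quadratic program. The example below is a hypothetical branch point with invented numbers, chosen so that the mutant feasible set has a single degree of freedom and the quadratic program reduces to projecting the wild-type vector onto a line segment.

```python
def moma_on_segment(v_wt, direction, s_min, s_max):
    """Closest point (Euclidean) to v_wt on {s * direction : s_min <= s <= s_max}."""
    dot_wd = sum(w * d for w, d in zip(v_wt, direction))
    dot_dd = sum(d * d for d in direction)
    # Unconstrained projection coefficient, then clamp to the allowed range
    s = min(max(dot_wd / dot_dd, s_min), s_max)
    return [s * d for d in direction]

# Hypothetical branch point: v1 = v2 + v3, all fluxes in [0, 10].
v_wild_type = [10.0, 8.0, 2.0]   # assumed wild-type optimum (illustrative values)

# Deleting reaction 2 forces v2 = 0, so the mutant must satisfy v1 = v3 = s
# with 0 <= s <= 10; the feasible set is the segment s * (1, 0, 1).
v_moma = moma_on_segment(v_wild_type, [1.0, 0.0, 1.0], 0.0, 10.0)
print(v_moma)
```

A growth-maximizing prediction for the same mutant would typically push the remaining branch to its capacity limit, whereas the minimal-adjustment prediction stays as close as possible to the wild-type flux pattern.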

Two studies producing the amino acids L-valine 29 and L-threonine 30 
have demonstrated the broad usage of GEM-aided computation for 
strain design. To generate an L-threonine production strain, GEM-aided 
modeling was employed in three different [...] 
industrial titers 30 . In one instance, in silico parametric sensitivity analysis 
that compared reaction activity to L-threonine production rate was used 
to identify the optimal activity of a key enzymatic reaction associated 
with maximal L-threonine production. The optimal activity prediction 
was subsequently used to tune the overexpression of the gene encoding 
the enzyme involved in this reaction through comparison to base-line 
activity and the result was a production increase. This method proved 
to be vital to the success of this strain, as a previous transcription-profiling 
guided attempt at overexpression resulted in an undesirable surplus of 
activity and was detrimental to L-threonine production. For the same 
strain, a GEM-aided flux analysis in conjunction with mRNA expression 
data levels also was used as a guide for elimination of negative regulation 
of a gene encoding an enzyme involved in a reaction that channeled flux 
toward the final product. The third use of the GEM for the design of 
this strain occurred when an unwanted byproduct was observed in the 
culture medium and computation was used to divert the flux from this 
byproduct to L-threonine 30 through overexpression of another key 
gene-encoded activity. 

The second analysis applied the systematic computational search algorithm 
previously described 24 to the updated E. coli GEM MBEL979 (ref. 
7) (which is similar to the iJR904 GEM 17 ) to improve L-valine production. 
The in silico analysis of beneficial knockouts to divert flux toward 
the desired product once again resulted in a significant [...] 
of the desired metabolite over that of an existing overproducing 
strain, in this case, a more than twofold increase 29 . Furthermore, the 
authors followed several additional metabolic engineering approaches 
to increase overproduction (that is, relieving feedback inhibition and 
regulation through attenuation, removing competing pathways, upregulation 
of primary biosynthetic pathways and overexpression of exporting 
machinery). When compared with each of the other individual st[...] 
modifications, the in silico GEM-aided interventions resulted in the greatest 
increase in L-valine production 29 . Taken together, these two studies 
demonstrate the broad applications for which GEMs can be used 
not only to design strains in a de novo fashion, but also to make further 
improvements on [...] through integrating and interpreting experimental 
data. 

Several other strain designs using E. coli GEMs have been reported. In 
a combined computational and experimental study, our group 25 has used 
the bi-level optimization algorithm OptKnock 20 and iJR904 (ref. 17) to 
overproduce lactate in E. coli. The algorithm OptKnock optimizes two 
objective functions, biomass formation and product [...] 
the creation of strains that can couple the [...] 



NATURE BIOTECHNOLOGY VOLUME 26 NUMBER 6 JUNE 2008 



REVIEW 




Figure 3 Applications of the genome-scale 
model (GEM) of E. coli divided into five 
categories. (1) A drawing of a predicted effect 
from a loss-of-function mutation in a simple 
system is shown. Metabolic engineering studies 
have investigated in silico strain design using 
E. coli metabolic reconstructions to overproduce 
desired products 20-30 . (2) Recent studies using 
the reconstruction in a prospective manner 
have aimed to use the current biochemical and 
genetic information included in the metabolic 
network along with additional data types to drive 
biological discovery, such as predicting genes 
responsible for orphan reactions 32,33,35-37 . 
(3) Using the reconstruction in phenotypic 
studies, computational analyses have examined 
gene 19,46,51,53,63 , metabolite 44,60 and 
reaction 39,47,48,58 essentiality along with considering 
thermodynamics 19,40,47,49,52,54,55,57,61 to 
make better predictions about the physiological 
state (that is, the active pathways) of the cell for 
a given environmental condition. (4) The E. coli 
reconstructions have been used to analyze and 
interpret the intrinsic properties of biological 
networks, one example being finding coupled 
reaction activities 66 (as shown in the drawing) 
across different growth conditions. 
(5) Using the network reconstruction, 
evolutionary studies have examined the cellular 
network in the context of adaptive evolution 
events 81 , horizontal gene transfer 80,81 and 
minimal metabolic network evolution 82 . 



Phenotypic behavior 



to the growth rate. Using adaptive evolution with growth-rate selection 
pressure, we found that the lactate-producing strains designed using 
OptKnock possess this growth-coupling property (as measured by growth 
rate, uptake and secretion rate profiles); thus, this study demonstrated the 
utility of adaptive evolution as a design tool 87 . 
Additional noteworthy examples of GEM-aided design are two studies 27,28 
demonstrating the use of GEM modeling based on iJR904 (ref. 17) in 
screening for genes of putative importance in succinate production. 
Combinatorial knockouts that were predicted to be overproducers in silico 
were experimentally verified to display the same overproducing phenotype 
in vivo. Furthermore, this method was shown to be superior to the 
use of comparative genomics for strain design, which was also performed 
in one of the studies 27 . 

A growing number of metabolic engineering studies thus demonstrate 
the use of GEMs to generate strain designs that are often nonintuitive and 
nonobvious. An excellent example of a nonintuitive strain improvement 
outlined in this section was when modeling was used not only to study 
the effect of a gene removal, but also to tune the expression of a gene to 
an optimally predicted level, which when surpassed, was detrimental to 
product formation. In this manner, genome-scale reconstructions allow 
the examination and simulation of metabolism as an integrated network, 
circumventing the possible shortcomings of methods that rely on manual 




Figure 4 Summary of the in silico methods used in the 64 published E. coli GEM studies reviewed here. This heat map characterizes the incorporation 
of different computational methods into studies using genome-scale models of E. coli 20,22,26,30,40,41,45,65-67,84-86,106-108 . A dark box indicates that a 
particular method (one method per row) was used in a corresponding study (one citation per column); the frequency of usage of a particular method is given 
on the right. Studies were grouped into one of five general categories, and studies examining phenotypic behavior were further divided into three subgroups. 
Studies that contributed new experimental growth data are also marked along the bottom offset row. 

Figure 5 Comparison of computation and experimental data: identification 
of agreements and disagreements. The comparison of GEM computation 
and organism-specific experimental measurements identifies agreements 
and disagreements. For comparison, phenotypic outcomes are tabulated 
for genetic perturbations examined in a given environment (e.g., growth 
or no growth) for both experiment and computation. A '+' indicates that a 
given phenotype is not affected by the perturbation; a '-' indicates that it is. 
Each outcome of comparison has a different implication: 1, consistency 
check, a perturbation has no effect on the property being measured and 
modeling predicts the same; 4, validation, the perturbation affects the 
experimental outcome and modeling with the GEM predicts this outcome; 
2, identification of missing content: when GEM modeling fails to predict 
the positive confirmation of the property being measured, this outcome 
indicates that there is missing content in the GEM and can lead to the 
identification of specific areas for biological discovery; 3, identification 
of errors, inconsistencies or missing context-specific information: a 
positive prediction for the measured property and an opposite experimental 
observation indicates a possible error in the current organism-specific 
knowledge or that additional context-specific information is lacking from the 
GEM or modeling method (e.g., transcriptional regulation). 

assessment of a limited number of interactions and fail to detect non-intuitive 
causal interactions. With the growing availability of organism- 
and strain-specific GEMs, applications for designing microbial strains 
for industrial production are expected to continue to grow. This growth 
expectation is in part based on the ongoing reconstruction of additional 
cellular processes, such as transcriptional regulation and protein production. 
Computations based on genome-scale models are also beginning 
to influence other areas of industrial microbiology such as generation of 
renewable energy 88-90 and bioremediation 89 . 

GEM-driven discovery in E. coli 

GEMs can also provide a guide to biological discovery. This capability is 
based on comparison of computed and actual experimental outcomes. 
Given the fact that BiGG knowledge bases are incomplete and that they 
contain gaps 91 , they provide a context for systematic discovery of missing 
information. The comparison between computation and experiments is 
summarized in Figure 5, highlighting how agreements and disagreements 
are analyzed. 
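
The four outcomes of the Figure 5 comparison can be expressed as a small lookup table. The following is a minimal sketch, not the authors' implementation: the function names (`classify`, `tabulate`, `agreement`) and the gene names and '+'/'-' calls in `example_calls` are hypothetical and exist only for illustration.

```python
# Sketch only: symbols follow the Figure 5 convention, where '+' means the
# phenotype is not affected by the perturbation and '-' means it is affected.
# The gene names and calls below are made up for illustration.

# (experiment, computation) -> outcome number and implication from Figure 5
OUTCOMES = {
    ("+", "+"): (1, "consistency check"),
    ("+", "-"): (2, "missing content in the GEM"),
    ("-", "+"): (3, "error, inconsistency or missing context-specific information"),
    ("-", "-"): (4, "validation"),
}


def classify(experiment, computation):
    """Return (outcome number, implication) for one perturbation."""
    return OUTCOMES[(experiment, computation)]


def tabulate(calls):
    """Count how many perturbations fall into each of the four outcomes."""
    counts = {number: 0 for number, _ in OUTCOMES.values()}
    for experiment, computation in calls.values():
        number, _ = classify(experiment, computation)
        counts[number] += 1
    return counts


def agreement(calls):
    """Fraction of perturbations where model and experiment agree (outcomes 1 and 4)."""
    counts = tabulate(calls)
    return (counts[1] + counts[4]) / len(calls)


# Hypothetical input: perturbation -> (experimental call, computed call)
example_calls = {
    "geneA": ("+", "+"),
    "geneB": ("-", "-"),
    "geneC": ("+", "-"),
    "geneD": ("-", "+"),
    "geneE": ("+", "+"),
}

if __name__ == "__main__":
    print(tabulate(example_calls))
    print(agreement(example_calls))
```

In this reading, perturbations that land in outcome 2 or 3 are the ones worth following up, since they point either to missing content or to a possible error in current knowledge.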

The current area of most interest is directing discovery efforts toward characterizing 
unknown ORFs in the E. coli genome. Ten years after the first 
release of the complete genome sequence 92 , many unknown ORFs still 
exist in the E. coli genome (Supplementary Table 1 online), with many 
of these likely to encode metabolic functions. ORF discovery using GEMs 
also has significant potential not only to affect how new and less studied 
genomes are annotated, but also to fill in the missing pieces of E. coli 
metabolism. 

To address this challenge, researchers have developed algorithms to 
determine the probable gene candidates that fill knowledge gaps in the 
E. coli and other network reconstructions. These algorithms use global 
network topology and genomic correlations, such as genome context and 
protein fusion events 32 , as well as local network topology and/or phylo- 
genetic profiles 32,33 . Similar tools have been developed that use mRNA 
coexpression 93 and which can evaluate more general metabolic pathway 
databases 94 . In addition to these network topology-based methods, an 
optimization-based procedure has also been developed to fill network gaps 
and evaluate reaction reversibility along with adding additional transport 
and intracellular reactions from databases of known metabolic reactions 36 . 
These studies produce specific targets for drill-down (that is, detailed bio- 
chemical characterization) experiments needed for confirmation of these 
computationally generated hypotheses. 

Two recent studies have combined computation and experiment to aid 
the ORF discovery process in E. coli, using a GEM and high-throughput 
phenotype data 35,37 . The first study 
(from our group 35 ) used an iterative process in which (i) differences between 
modeling predictions and high-throughput growth phenotype data were 
identified, (ii) potential missing reactions that remedy these disagreements 
were algorithmically determined, (iii) bioinformatics was used to identify 
likely encoding ORFs and (iv) resulting targeted ORFs were cloned and 
experimentally characterized. Application of this process led to the functional 
characterization of eight ORFs involved in transport, regulatory 
and metabolic functions in E. coli 35 . The discovery process was aided by 
a high-throughput growth phenotyping analysis and the genome-wide 
single-gene mutant collection 95 , along with other characterization analy- 
ses such as targeted expression profiling. 

The second GEM-based analysis resulting in ORF discovery used net- 
work topology to examine orphan reactions in the E. coli network (that is, 
reactions known to exist in E. coli that have not been linked to an encoding 
gene) identified by the previously mentioned network topology-based 
gap-filling algorithms 32,33,93 . The basic premise behind these algorithms 
is the utilization of an orphan reaction's network neighbors as constraints 
to assign metabolic function. With the resulting tentative ORF assignment, 
biochemical characterization studies using genetic mutants 95 , analysis of 
growth under different substrate conditions and expression data were all 
used to characterize and assign function to an orphan ORF that is respon- 
sible for a metabolic conversion that has been known for 25 years 37 . 

Further studies in this category of biological discovery applications 
(not focused on ORF identification) have used GEMs of E. coli to identify 
potential bottleneck reactions in the metabolic network 34 and as-yet 
uncharacterized transcription factor-target interactions in E. coli 31 . The 
latter study, targeting the elucidation of regulatory and metabolic interactions 
in E. coli, developed an iterative procedure focused on reconciling 
computational and experimental discrepancies stemming from high-throughput 
growth phenotype and gene expression data, where selected 
expression changes were validated using RT-PCR 31 . With the advancement 
of high-throughput technologies to test the hypotheses generated 
from computational studies, these and similar algorithmic approaches are 
likely to continue to aid in the quest to achieve full functional annotation 
of the E. coli genome and its context-specific uses. 

GEM-aided phenotypic assessment 

The area where the E. coli GEMs have been most extensively used is 
the examination and quantitative interpretation of metabolic physiology 
for wild-type, genetically perturbed and adaptively evolved strains of 
E. coli 19,38-63 . These efforts have implications for both the quantitative and 
qualitative understanding of physiological states of the cell. Furthermore, 
these efforts have examined E. coli physiology for a vast number of 
genetic and environmental conditions, and incorporation of the developed 
methods will have an impact on future design of biological systems and 
modeling approaches. A large subset of these studies of phenotypic behavior 
aims to use thermodynamic laws and information to refine phenotype 
predictions of GEMs and to incorporate metabolomic and fluxomic data 
into modeling 19,40,47,49,52,54,55,57,61 . 

A set of distinct computational methods using GEMs has been developed 
to determine the physiological state of E. coli after genetic perturbations 41,45,50 . 
These studies have used 13 C flux measurements and 
growth-rate phenotype data to evaluate the predictive accuracy of the developed 
algorithms when compared to experimental observations. Whereas comparisons 
to flux data from wild-type and mutant E. coli reveal that the 
computational algorithm MOMA 41 provides better predictions for transient 
growth rates (early post-perturbation state), the algorithm ROOM 
(regulatory on/off minimization) 45 and basic FBA were more successful 
in predicting final steady-state growth rates and overall lethality 45 . These 
algorithms have been used, in addition to basic FBA, for genome-wide 
essentiality screens, as outlined below. 

A range of computational studies have sought to understand phenotypes 
through determining the essential genes 19,46,51,53,63 , metabolites 44,60 
and reactions 39,47,48,58 in the E. coli metabolic network. A common benchmark 
for examining GEM predictive ability is to determine the agreement 
with growth phenotype data from knockout collections of E. coli. Such 
studies (e.g., refs. 19,53) will be further enabled by a comprehensive single-gene 
knockout library for E. coli recently made available 95 . Implications 
of examining network essentiality in E. coli include determining network 
essentiality in similar organisms 39,48,53,58 , deciphering network makeup 
and enzyme dispensability (that is, measures of robustness) 46,58,50 , aiding 
in metabolic network annotation, validation and refinement 44 , and even 
rescuing knockout strains through additional gene deletions 63 , to name 
a few. The predictive capability of the E. coli GEM, as demonstrated by 
these studies, has been instrumental in adapting it for different uses. One 
particular study examining knockout phenotypes has demonstrated that 
the E. coli GEM was able to predict the outcomes of adaptively evolved 
strains to a high degree (78%) when knockout E. coli strains were grown 
in several different substrate environments by examining growth rates at 
the beginning and end of adaptive evolution 43 . This study represents a 
demonstration of a GEM's ability to look at adaptive behavior (or 'distal' 
causation 96 ), in addition to immediate behavior (or 'proximal' causa- 
tion 96 ). Predictive capability is expected to improve through examining 
growth behavior across a greater number of environments (additional 
phenotyping screens will be necessary) and with an increase of integration 
of additional cellular processes 31 . Genetic perturbations have played a key 
role in the study of the genotype-phenotype relationship in biology and 
GEMs can be used to mechanistically interpret the results and predict the 
outcomes of such perturbations. 

Incorporating thermodynamic information into E. coli GEMs has 
shown promise in narrowing predictions of allowable physiological states 
in a given environment 19,40,47,49,52,54,55,57,61 and in identifying reactions 
likely to be subject to active allosteric or genetic regulation 49,54 . This field is 
progressing rapidly and should increase the predictive capabilities 
of genome-scale modeling through the addition of governing thermodynamic 
physicochemical constraints. One particular analysis incorporating 
compound formation and reaction energies for the content of the GEM 
based on iJR904 (ref. 17) identified reactions that are likely to be effectively 
irreversible for any realistic metabolite concentration 54 . The hypothesis 
was advanced that these reactions are candidates for cellular regulation 
in their respective pathways because enzyme regulation will likely be the 
dominant mechanism for control of flux through these reactions 54 . 
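
The idea of a reaction being "effectively irreversible for any realistic metabolite concentration" can be illustrated with the standard relation ΔG = ΔG° + RT ln Q. The sketch below is an assumption-laden illustration, not the method of the cited study: the concentration bounds (`c_min`, `c_max`), the temperature, the ΔG° values and the function names are all hypothetical, and activities are approximated by concentrations.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def delta_g_range(dg0, stoich, c_min=1e-5, c_max=2e-2, temperature=298.15):
    """Return (min, max) of dG = dG0 + RT*ln(Q) over a box of concentrations.

    dg0    : standard reaction energy in kJ/mol (hypothetical input)
    stoich : metabolite -> coefficient (negative = substrate, positive = product)
    c_min, c_max : assumed bounds on every metabolite concentration, in mol/L
    """
    rt = R * temperature
    lowest = highest = dg0
    for coeff in stoich.values():
        # Each term coeff*RT*ln(c) is monotonic in c, so its extremes
        # are reached at one of the two concentration bounds.
        at_min = coeff * rt * math.log(c_min)
        at_max = coeff * rt * math.log(c_max)
        lowest += min(at_min, at_max)
        highest += max(at_min, at_max)
    return lowest, highest


def directionality(dg0, stoich, **bounds):
    """Classify a reaction by the sign of dG across the whole concentration box."""
    lowest, highest = delta_g_range(dg0, stoich, **bounds)
    if highest < 0:
        return "irreversible forward"
    if lowest > 0:
        return "irreversible reverse"
    return "reversible"


if __name__ == "__main__":
    reaction = {"A": -1, "B": 1}  # hypothetical reaction A -> B
    for dg0 in (-30.0, -5.0, 30.0):
        print(dg0, directionality(dg0, reaction))
```

Under these assumed bounds, a reaction whose ΔG stays negative at every corner of the concentration box cannot run in reverse, which is the sense in which such reactions become candidates for regulation.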



The addition of thermodynamics in GEM modeling enables the analysis 
of metabolomic data in the context of a reconstruction. A study using 
high-throughput metabolomic data and GEMs proposed likely regulatory 
interactions by deciphering the metabolite concentrations in the context 
of overall network functionality 49 . Not only did the metabolomic data 
benefit computations by constraining the system using physiological mea- 
surements, but the computational predictions were also able to validate 
quantitative metabolomic data sets for consistency through providing a 
functional context to relate metabolite concentrations. This application 
is one example of how metabolomic data will directly influence modeling 
and how metabolite concentration data are likely to greatly influence 
future metabolic modeling owing to its intimate connection with GEM 
content. Similar work incorporating other quantitative values with FBA, 
such as metabolite concentrations 57 and flux ratios at branch points in 
metabolism 56 , is also appearing. 

Applying a different physicochemical constraint, molecular crowding, 
a framework has also been developed to incorporate spatial constraints 
into FBA 59 . The functional states predicted with this method (that is, FBA 
with molecular crowding; FBAwMC) and the E. coli GEM were validated 
against generated growth, substrate and production rate data along with 
gene expression profiles and enzyme activity measures to demonstrate 
predictive accuracy, including substrate preferentiality, when examining 
growth in complex substrate environments 59,62 . Overall, these studies, 
which incorporate reaction thermodynamics and additional cellular constraints, 
should further narrow the range of allowable functional network 
states that can be determined from stoichiometry alone and thus improve the 
utility of GEMs. 

In addition to analyses on the genomic scale, several studies modeling 
the metabolism of E. coli on a smaller scale have been performed. These 
analyses typically use models containing ~100 reactions or fewer and most 
often focus on incorporating nonlinear analysis to understand quantita- 
tive experimental data (e.g., isotopomer modeling). With the advance- 
ment of computational power and developed platforms, the networks 
that can be analyzed will grow in size 97 . Given that the results produced 
from such analyses as isotopomer modeling have been shown to be highly 
dependent on the content of a reduced model, the logical starting point 
for building such models is the E. coli GEM 97 . Several noteworthy studies 
have been conducted with reduced models, but are not detailed here as 
they are outside the scope of this review. 

GEMs and network property analysis 

E. coli is generally viewed as having the most complete characterization 
of any model organism 98,99 . Because of the incorporation of thousands 
of metabolic interactions with relatively high reliability (e.g., 92% of 
the genes included in the latest reconstruction of E. coli 19 have experi- 
mentally determined annotated functions 99 ; Table 1), validated genome- 
scale reconstructions of E. coli have become popular resources for the 
analysis of various network properties 64-79 . The methods designed to 
analyze the underlying network structure of E. coli metabolism, some 
characterizing its interplay with regulation, have been developed to 
determine several physiological features. These features include the 
most probable active pathways and metabolites used under all possible 
growth conditions 67,69,73,75 , the existence of alternative optimal solutions 
and their physiological significance 65 , conserved intracellular pools of 
metabolites 68 , coupled reaction activities 66 and their relationship to gene 
co-expression 77 , metabolite coupling (our laboratory 71 ), metabolite 
utilization 72 , the organization of metabolic networks 64,76 , strategies for 
E. coli to incorporate metabolic redundancy 78 and the dominant func- 
tional states of the E. coli network across various environments 70,74,79 . 
These findings are driven both by biased approaches using FBA and 
biomass objective function optimization and by unbiased approaches 
such as graph-based analyses (see Fig. 4). One noteworthy study using 
the GEM-outlined network examined thousands of different potential 
growth conditions and observed a 'high-flux backbone' in E. coli that both 
carried high levels of flux across the different environmental conditions 
and was composed of a relatively small set of enzymatic reactions 67 . This 
result can be of practical importance for synthetic biology efforts aimed 
toward manipulating flux within biological systems. Furthermore, this 
finding was hypothesized to be a universal feature of metabolic activity 
in all cells and was consistent with flux measurements from 13 C-labeling 
experiments 67 . 

The studies in this category have a common systems biology theme; 
namely, the development and subsequent demonstration of methods that 
identify sets of reactions or metabolites with correlated or coordinated 
functions and systematic relationships. The systems biology that these 
methods enable and demonstrate has potential implications for (i) antimicrobial 
drug-target discovery 68,69 , (ii) aiding the development of additional 
metabolic reconstructions 66,68 , (iii) guiding genetic manipulations 66 , (iv) 
improving metabolic engineering applications 67,68 and (v) increasing the 
general understanding of biological network behavior 65,74,77 and resilience 78 . 
The role that the E. coli GEM has taken is that of a comprehensive and 
curated set of up-to-date metabolic knowledge, thus providing a scaffold 
for these large-scale computations. 

GEMs and bacterial evolution 

The GEMs of E. coli metabolism have been used to examine the process 
of bacterial evolution 80-82 . Specifically, the network reconstructions have 
been used to interpret adaptive evolution events 81 , horizontal gene transfer 80,81 
and evolution to minimal metabolic networks 82 . These studies, 
which use the E. coli reconstruction as an organism-specific genetic and 
metabolic content database, and the corresponding GEM, have been able 
to provide insight into evolutionary events through combining known 
physiological data (e.g., in various environmental conditions) with 
hypotheses and in silico computation. Examining the evolution of minimal 
metabolic networks through simulation demonstrated that it was possible 
to predict the gene content of close relatives of E. coli by examining the 
necessity of genes and reactions in the overall context of the system functionality 
for a specific lifestyle 82 . Similarly, by re-examining network functionality 
in a number of different environments and through the use of 
comparative genomics, it has been shown that recent evolutionary events 
(e.g., horizontal gene transfer) probably resulted from a response to a 
change in environment 81 . Furthermore, computational analysis led to the 
additional conclusion that these horizontal gene transfer events are more 
likely if the host organism contains an enzyme that catalyzes a coupled 
metabolic flux related to the transferred enzyme's function 80,81 . Taken 
together, these studies demonstrate the importance of having high-qual- 
ity curated reconstructions to enable studies on an organism's response 
to environmental changes and for understanding the fundamental forces 
driving bacterial evolution. 

Conclusions 

Since the first review on constraint-based methods appeared in Nature 
Biotechnology in 1994 (ref. 84), the field has grown rapidly. The myriad 
studies described in this article highlight the rapid development and use 
of genome-scale reconstruction and derived computational models to 
address a growing spectrum of basic research and applied problems. 
Experience with genome-scale reconstructions has demonstrated that 
they are a common denominator in the systems analysis of metabolic 
functions. With the recognition of its basic paradigms and a growing 
spectrum of practical uses enabled, this field now faces several exciting 
challenges in three major areas: first, network reconstructions 
and the reconstruction process; second, computational BiGG query tools 
(that is, modeling); and third, application to proximal and distal causation 
in biology. 

The scope of reconstructions is bound to grow, representing more and 
more BiGG knowledge in the structured format of a GEM 91 . Growth in 
scope in the near term will on one front involve the transcriptional and 
translational machinery of bacterial cells 100-102 . Such an extension will 
enable a range of studies including the direct inclusion of proteomic data, 
fine graining of growth requirements and the explicit consideration of 
secreted protein products. Another expansion in scope in the near term 
is the reconstruction of the genome-scale transcriptional regulatory net- 
work (TRN). Such reconstruction at the genome-scale is now enabled 
by new experimental technologies, such as chromatin immunoprecipitation 
(ChIP)-chip 103 . Experimental interrogation of the currently available 
TRN suggests that we know about one-fourth to one-third of its content 31 , 
indicating that there is much to be discovered. Once reconstructed, the 
TRN will allow computational predictions of the context-specific uses of 
the E. coli genome and the responses of two-component signaling systems. 
Taken together, these near-term expansions in content will encompass the 
activity of approximately 2,000 ORFs in the E. coli genome. 

Mid-term expansions in scope will include the growth cycle, shock 
responses and additional cellular functions. Such a reconstruction should 
eventually be a comprehensive representation of the chemical reactions 
and transactions enabled by E. coli's gene products. Longer-term recon- 
struction may begin to address the three-dimensional organization of 
the bacterial cell. In particular, high-resolution ChIP-chip data on DNA-binding 
proteins could enable not only the estimation of the topological 
arrangement of the genome but also the elucidation of the structure of 
the cell wall and other cellular structures that will allow us a full three- 
dimensional reconstruction of E. coli. 

We now know how to represent BiGG data in either a stoichiometric 
format or in the form of causal relationships (e.g., see ref. 104 from our 
laboratory) and how to use them to perform several lines of computa- 
tional inquiries. Computational query tools of GEMs will continue to be 
developed. New advances will likely include modularization methods, use 
of fluxomic data and eventually kinetics. As the scope and content of the 
reconstruction grows, the need to modularize its content becomes more 
pressing. Fine- or coarse-grained views of cellular processes are needed 
for different applications. For instance, as previously mentioned, current 
computational limitations force the reduction of a network for the analysis 
of isotopomer data, and a rational way to carry out such reduction is 
needed. Given the systemic nature of fluxomic data and its phenotypic 
relevance, there is a pressing need to increase the size of the networks that 
can be analyzed for experimental measurement and estimation of flux 
states. Finally, although detailed kinetic models of microbial functions 
may currently be mostly of academic interest, we will most likely be able 
to construct them in the mid-term based on advances with metabolomic 
and fluxomic data, in addition to the developments that are occurring 
with the incorporation of thermodynamic information. Such large-scale 
kinetic models are likely to differ from those resulting from traditional 
approaches for construction of kinetic models as they come with differ- 
ent challenges. 

As this article shows, the scope of applications of genome-scale recon- 
structions and GEMs is growing. Going forward, we wish to comment 
on three categories of applications: growth in coverage (that is, gap-fill- 
ing), engineering (that is, synthetic biology) and the development of our 
understanding of fundamental biology (see Step 4, Fig. 1). Growth in 
coverage will come through discovery of missing network components. 
For instance, the latest metabolic reconstruction, iAF1260, contains 
14% blocked reactions 19 . This disconnected content means that we have 
knowledge gaps that have arisen due to characterization of individual 
gene products outside the context of a given physiological function (that 
is, outside a defined pathway). Metabolomic profiling is one measure that 
will provide us with the missing upstream or downstream routes to such 
dead ends in the network. Also, an expansion of scope in modeling will 
allow further investigation of network content, such as tRNA-charging 
reactions that are currently in this blocked reaction set 19 . Furthermore, 
growing metabolomic data suggest that we are discovering the existence 
of several new metabolites. Pathways that include these metabolites need 
to be discovered. Methods exist to compute missing pathways between 
molecules 105 that can be applied to such data. Such pathways, in turn, will 
lead to experimental programs to discover novel gene functions and to 
validate or refute the existence of such pathways. Similarly, we expect that 
a number of the components of TRNs are missing, such as new small RNA 
molecules (see Supplementary Table 1). Clearly, maintaining the quality 
control/quality assurance of such reconstructions will help in guiding 
us to a comprehensive genome-scale representation of all major cellular 
processes in bacteria at the BiGG data level of resolution that, in turn, 
enables GEMs of growing coverage and resolution. 
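
The notion of dead ends and blocked reactions discussed above can be approximated with a purely topological screen. The sketch below is an illustration under stated assumptions, not the procedure used for iAF1260: it flags metabolites that can only be produced or only be consumed and then iteratively removes the reactions that touch them. A full treatment would test each reaction's flux range by optimization. The network `toy` and all reaction and metabolite names are hypothetical.

```python
def dead_end_metabolites(reactions):
    """Return metabolites that can only be produced or only be consumed.

    reactions: name -> (stoich, reversible), where stoich maps a metabolite to
    its coefficient (negative = consumed, positive = produced).
    """
    producible, consumable = set(), set()
    for stoich, reversible in reactions.values():
        for metabolite, coeff in stoich.items():
            if coeff > 0 or reversible:
                producible.add(metabolite)
            if coeff < 0 or reversible:
                consumable.add(metabolite)
    # Metabolites present in exactly one of the two sets are dead ends.
    return producible ^ consumable


def blocked_reactions(reactions):
    """Iteratively collect reactions that touch a dead-end metabolite."""
    active = dict(reactions)
    blocked = set()
    while True:
        dead = dead_end_metabolites(active)
        hit = {
            name
            for name, (stoich, _) in active.items()
            if any(metabolite in dead for metabolite in stoich)
        }
        if not hit:
            return blocked
        blocked |= hit
        for name in hit:
            del active[name]


# Hypothetical toy network: A is taken up, C is secreted, and the branch
# B -> D -> E ends in a metabolite (E) that nothing consumes.
toy = {
    "uptake_A": ({"A": 1}, False),
    "R1": ({"A": -1, "B": 1}, False),
    "R2": ({"B": -1, "C": 1}, False),
    "secrete_C": ({"C": -1}, False),
    "R3": ({"B": -1, "D": 1}, False),
    "R4": ({"D": -1, "E": 1}, False),
}

if __name__ == "__main__":
    print(sorted(dead_end_metabolites(toy)))
    print(sorted(blocked_reactions(toy)))
```

In the toy case, removing the reaction that produces E exposes D as a new dead end, which is why the screen has to be repeated until no further reactions are removed.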

Predictive models allow design. In fact, in engineering, there is 'nothing 
more useful than a good theory'. As this article demonstrates, genomics 
and high-throughput technologies have enabled the construction of predictive 
computational models. The scope of such predictions is limited at 
the moment, but with the growing scope and coverage of genome-scale 
reconstructions and advancements in the development of computational 
tools, this scope will broaden. Not only will GEMs influence design in 
synthetic biology, but their influence in discovery of cellular content will 
provide a more complete picture of the environment (that is, the parts list 
in the cell) in which future synthetically engineered constructs and circuits 
will be placed. The impact of GEMs on synthetic biology is thus likely to 
be notable, ranging from the provision of the cellular context of a small-scale 
gene circuit design to engineering of the entire genome-scale network 
toward fundamentally new and useful (that is, production) phenotypes. 

Finally, we can speculate about the deep scientific impact that comprehensive 
predictive GEMs will have on our understanding of the living 
process. A comprehensive view of cellular functions will allow us to study 
the fundamental properties of both the underlying energy and information 
flows in living organisms. Such a view is likely to deeply affect our 
understanding of both distal and proximal causation in biology. 

Note: Supplementary information is available on the Nature Biotechnology website. 

The Escherichia coli MG1655 genome has been completely sequenced. 
The annotated sequence, biochemical information, and 
other information were used to reconstruct the E. coli metabolic 
map. The stoichiometric coefficients for each metabolic enzyme in 
the E. coli metabolic map were assembled to construct a genome- 
specific stoichiometric matrix. The E. coli stoichiometric matrix was 
used to define the system's characteristics and the capabilities of 
E. coli metabolism. The effects of gene deletions in the central 
metabolic pathways on the ability of the in silico metabolic net- 
work to support growth were assessed, and the in silico predictions 
were compared with experimental observations. It was shown that 
based on stoichiometric and capacity constraints the in silico 
analysis was able to qualitatively predict the growth potential of 
mutant strains in 86% of the cases examined. Herein, it is demon- 
strated that the synthesis of in silico metabolic genotypes based on 
genomic, biochemical, and strain-specific information is possible, 
and that systems analysis methods are available to analyze and 
interpret the metabolic phenotype. 

bioinformatics | metabolism | genotype-phenotype relation | flux balance 

The complete genome sequence for a number of microor- 
ganisms has been established (The Institute for Genomic 
Research at www.tigr.org). The genome sequencing efforts and 
the subsequent bioinformatic analyses have defined the mo- 
lecular "parts catalogue" for a number of living organisms. 
However, it is evident that cellular functions are multigenic 
in nature, thus one must go beyond a molecular parts catalogue 
to elucidate integrated cellular functions based on the molec- 
ular cellular components (1). Therefore, to analyze the prop- 
erties and the behavior of complex cellular networks, one 
needs to use methods that focus on the systemic properties of 
the network. Approaches to analyze, interpret, and ultimately 
predict cellular behavior based on genomic and biochemical 
data likely will involve bioinformatics and computational 
biology and form the basis for subsequent bioengineering 
analysis. 

In moving toward the goal of developing an integrated de- 
scription of cellular processes, it should be recognized that there 
exists a history of studying the systemic properties of metabolic 
networks (2) and many mathematical methods have been de- 
veloped to carry out such studies. These methods include 
approaches such as metabolic control analysis (3, 4), flux balance 
analysis (FBA) (5-7), metabolic pathway analysis (8-11, 69), 
cybernetic modeling (12), biochemical systems theory (13), 
temporal decomposition (14), and so on. Although many math- 
ematical methods and approaches have been developed, there 
are few comprehensive metabolic systems for which detailed 
kinetic information is available and where such detailed analysis 
can be carried out (see refs. 15-17 for a few noteworthy 
exceptions). 

To analyze, interpret, and predict cellular behavior, each 
individual step in a biochemical network must be described, 
normally with a rate equation that requires a number of kinetic
constants. Unfortunately, it currently is not possible to formulate
this level of description of cellular processes on a
genome scale. The kinetic parameters cannot be estimated 
from the genome sequence and these parameters are not 
available in the literature. In the absence of kinetic informa- 
tion, it is, however, still possible to assess the theoretical 
capabilities of one integrated cellular process, namely metab- 
olism, and examine the feasible metabolic flux distributions 
under a steady-state assumption. The steady-state analysis is 
based on the constraints imposed on the metabolic network by 
the stoichiometry of the metabolic reactions, which basically 
represent mass balance constraints. The steady-state analysis 
of metabolic networks based on the mass balance constraints 
is known as FBA (7, 18, 19). This analysis differs from detailed 
kinetic modeling of cellular processes, in that it does not 
attempt to predict the exact behavior of metabolic networks. 
Rather it uses known constraints on the integrated function of 
multiple enzymes to separate the states that a system can reach 
from those that it cannot. Then within the domain of allowable 
behavior one can study the genotype-phenotype relation, such 
as the stoichiometric optimal growth performance in a defined 
environment. 

In this manuscript, we have used the biochemical literature, 
the annotated genome sequence data, and strain-specific infor- 
mation, to formulate an organism scale in silico representation 
of the Escherichia coli MG1655 metabolic capabilities. FBA 
then was used to assess metabolic capabilities subject to these 
constraints leading to qualitative predictions of growth 
performance. 

Materials and Methods 

Definition of the E. coli MG1655 Metabolic Map. An in silico repre- 
sentation of E. coli metabolism has been constructed. We have used 
the biochemical literature (20), genomic information (21), and the 
metabolic databases (22-24). Because of the long history of E. coli 
research, there was biochemical or genetic evidence for every
metabolic reaction included in the in silico representation, and in
most cases, there was both genetic and biochemical evidence (Table
1). The complete list of genes included in the in silico analysis is
shown in Table 1, and the metabolic reactions catalyzed by these
genes can be found on the web (http://gcrg.ucsd.edu/downloads.html).
The stoichiometric coefficients for each metabolic reaction within
this list were used to form the stoichiometric matrix.

Determining the Capabilities of the E. coli Metabolic Network. The
theoretical metabolic capabilities of E. coli were assessed by FBA
EXHIBIT 8



Table 1. The genes included in the E. coli metabolic genotype (21)

Central metabolism (EMP, PPP, TCA cycle, electron transport):
aceA, aceB, aceE, aceF, ackA, acnA, acnB, acs, adhE, agp, appB, appC, atpA, atpB, atpC, atpD, atpE, atpF,
atpG, atpH, atpI, cydA, cydB, cydC, cydD, cyoA, cyoB, cyoC, cyoD, dld, eno, fba, fbp, fdhF, fdnG, fdnH,
fdnI, fdoG, fdoH, fdoI, frdA, frdB, frdC, frdD, fumA, fumB, fumC, galM, gapA, gapC_1, gapC_2, glcB,
glgA, glgC, glgP, glk, glpA, glpB, glpC, glpD, gltA, gnd, gpmA, gpmB, hyaA, hyaB, hyaC, hybA, hybC,
hycB, hycE, hycF, hycG, icdA, lctD, ldhA, lpdA, malP, mdh, ndh, nuoA, nuoB, nuoE, nuoF, nuoG, nuoH,
nuoI, nuoJ, nuoK, nuoL, nuoM, nuoN, pckA, pfkA, pfkB, pflA, pflB, pflC, pflD, pgi, pgk, pntA, pntB, ppc,
ppsA, pta, purT, pykA, pykF, rpe, rpiA, rpiB, sdhA, sdhB, sdhC, sdhD, sfcA, sucA, sucB, sucC, sucD, talB,
tktA, tktB, tpiA, trxB, zwf, pgl (30), maeB (30)

Alternative carbon source:
adhC, adhE, agaY, agaZ, aldA, aldB, aldH, araA, araB, araD, bglX, cpsG, deoB, fruK, fucA, fucI, fucK, fucO,
galE, galK, galT, galU, gatD, gatY, glk, glpK, gntK, gntV, gpsA, lacZ, manA, melA, mtlD, nagA, nagB,
nanA, pfkB, pgi, pgm, rbsK, rhaA, rhaB, rhaD, srlD, treC, xylA, xylB

Amino acid metabolism:
adi, aldH, alr, ansA, ansB, argA, argB, argC, argD, argE, argF, argG, argH, argI, aroA, aroB, aroC, aroD, aroE,
aroF, aroG, aroH, aroK, aroL, asd, asnA, asnB, aspA, aspC, avtA, cadA, carA, carB, cysC, cysD, cysE, cysH,
cysI, cysJ, cysK, cysM, cysN, dadA, dadX, dapA, dapB, dapD, dapE, dapF, dsdA, gabD, gabT, gadA, gadB,
gdhA, glk, glnA, gltB, gltD, glyA, goaG, hisA, hisB, hisC, hisD, hisF, hisG, hisH, hisI, ilvA, ilvB, ilvC, ilvD,
ilvE, ilvG_1, ilvG_2, ilvH, ilvI, ilvM, ilvN, kbl, ldcC, leuA, leuB, leuC, leuD, lysA, lysC, metA, metB, metC,
metE, metH, metK, metL, pheA, proA, proB, proC, prsA, putA, sdaA, sdaB, serA, serB, serC, speA, speB,
speC, speD, speE, speF, tdcB, tdh, thrA, thrB, thrC, tnaA, trpA, trpB, trpC, trpD, trpE, tynA, tyrA, tyrB,
ygjG, ygjH, alaB (42), dapC (43), pat (44), prr (44), sad (45), methylthioadenosine nucleosidase (46),
5-methylthioribose kinase (46), 5-methylthioribose-1-phosphate isomerase (46), adenosyl homocysteinase
(47), L-cysteine desulfhydrase (44), glutaminase A (44), glutaminase B (44)

Purine & pyrimidine metabolism:
add, adk, amn, apt, cdd, cmk, codA, dcd, deoA, deoD, dgt, dut, gmk, gpt, gsk, guaA, guaB, guaC, hpt,
mutT, ndk, nrdA, nrdB, nrdD, nrdE, nrdF, purA, purB, purC, purD, purE, purF, purH, purK, purL, purM,
purN, purT, pyrB, pyrC, pyrD, pyrE, pyrF, pyrG, pyrH, pyrI, tdk, thyA, tmk, udk, udp, upp, ushA, xapA, yicP,
CMP glycosylase (48)

Vitamin & cofactor metabolism:
acpS, bioA, bioB, bioD, bioF, coaA, cyoE, cysG, entA, entB, entC, entD, entE, entF, epd, folA, folC, folD, folE,
folK, folP, gcvH, gcvP, gcvT, gltX, glyA, gor, gshA, gshB, hemA, hemB, hemC, hemD, hemE, hemF, hemH,
hemK, hemL, hemM, hemX, hemY, ilvC, lig, lpdA, menA, menB, menC, menD, menE, menF, menG, metF,
mutT, nadA, nadB, nadC, nadE, ntpA, pabA, pabB, pabC, panB, panC, panD, pdxA, pdxB, pdxH, pdxJ,
pdxK, pncB, purU, ribA, ribB, ribD, ribE, ribH, serC, thiC, thiE, thiF, thiG, thiH, thrC, ubiA, ubiB, ubiC, ubiG,
ubiH, ubiX, yaaC, ygiG, nadD (49), nadF (49), nadG (49), panE (50), pncA (49), pncC (49), thiB (51), thiD (51),
thiK (51), thiL (51), thiM (51), thiN (51), ubiE (52), ubiF (52), arabinose-5-phosphate isomerase (22),
phosphopantothenate-cysteine ligase (50), phosphopantothenate-cysteine decarboxylase (50),
phospho-pantetheine adenylyltransferase (50), dephosphoCoA kinase (50), NMN glycohydrolase (49)

Lipid metabolism:
accA, accB, accD, atoB, cdh, cdsA, cls, dgkA, fabD, fabH, fadB, gpsA, ispA, ispB, pgpB, pgsA, psd, pssA, pgpA (53)

Cell wall metabolism:
ddlA, ddlB, galF, galU, glmS, glmU, htrB, kdsA, kdsB, kdtA, lpxA, lpxB, lpxC, lpxD, mraY, msbB, murA, murB,
murC, murD, murE, murF, murG, murI, rfaC, rfaD, rfaF, rfaG, rfaI, rfaJ, rfaL, ushA, glmM (54), lpcA (55),
rfaE (55), tetraacyldisaccharide 4' kinase (55), 3-deoxy-D-manno-octulosonic-acid 8-phosphate
phosphatase (55)

Transport processes:
araE, araF, araG, araH, argT, aroP, artI, artJ, artM, artP, artQ, brnQ, cadB, chaA, chaB, chaC, cmtA, cmtB,
codB, crr, cycA, cysA, cysP, cysT, cysU, cysW, cysZ, dctA, dcuA, dcuB, dppA, dppB, dppC, dppD, dppF, fadL,
focA, fruA, fruB, fucP, gabP, galP, gatA, gatB, gatC, glnH, glnP, glnQ, glpF, glpT, gltI, gltK, gltL, gltP, gltS,
gntT, gpt, hisJ, hisM, hisP, hisQ, hpt, kdpA, kdpB, kdpC, kgtP, lacY, lamB, livF, livG, livH, livJ, livK, livM,
lldP, lysP, malE, malF, malG, malK, malX, manX, manY, manZ, melB, mglA, mglB, mglC, mtlA, mtr, nagE,
nanT, nhaA, nhaB, nupC, nupG, oppA, oppB, oppC, oppD, oppF, panF, pheP, pitA, pitB, pnuC, potA, potB,
potC, potD, potE, potF, potG, potH, potI, proP, proV, proW, proX, pstA, pstB, pstC, pstS, ptsA, ptsG, ptsI,
ptsN, ptsP, purB, putP, rbsA, rbsB, rbsC, rbsD, rhaT, sapA, sapB, sapD, sbp, sdaC, srlA_1, srlA_2, srlB, tdcC,
tnaB, treA, treB, trkA, trkG, trkH, tsx, tyrP, ugpA, ugpB, ugpC, ugpE, uraA, xapB, xylE, xylF, xylG, xylH,
fruF (56), gntS (57), metD (43), pnuE (49), scr (56)

The in silico E. coli MG1655 metabolic genotype used herein [...] available on the web: http://gcrg.ucsd.edu/downloads.html.



(5-7). The metabolic capabilities of the in silico metabolic 
genotype were partially defined by mass balance constraints; 
mathematically represented by a matrix equation: 

$S \cdot v = 0$. [1]

The matrix S is the m × n stoichiometric matrix, where m is the
number of metabolites and n is the number of reactions in the
network. The E. coli stoichiometric matrix was 436 × 720. The
vector v represents all fluxes in the metabolic network, including 
the internal fluxes, transport fluxes, and the growth flux. The 
optimal v vector was determined and defined the steady-state 
metabolic flux distribution. 



For the E. coli metabolic network, the number of fluxes was 
greater than the number of mass balance constraints; thus, there 
was a plurality of feasible flux distributions that satisfied the 
mass balance constraints (defined in Eq. 1), and the solutions (or 
feasible metabolic flux distributions) were confined to the 
nullspace of the matrix S. 
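
As a minimal sketch of Eq. 1, assuming a hypothetical three-reaction toy network rather than the 436 × 720 E. coli matrix, the mass balance check and the dimension of the nullspace might be computed as follows. The names `mass_balance_residual` and `rank` are illustrative and do not come from the original analysis.

```python
from fractions import Fraction

# Toy network (hypothetical): uptake -> A, A -> B, B -> export.
# Rows are metabolites (A, B); columns are reactions.
S = [
    [1, -1, 0],   # A: produced by uptake, consumed by A -> B
    [0, 1, -1],   # B: produced by A -> B, consumed by export
]

def mass_balance_residual(S, v):
    """Return S . v, the net production rate of each metabolite."""
    return [sum(s * f for s, f in zip(row, v)) for row in S]

def rank(matrix):
    """Rank of a matrix by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

steady = mass_balance_residual(S, [2, 2, 2])      # every metabolite balanced
unbalanced = mass_balance_residual(S, [2, 1, 1])  # metabolite A accumulates
nullspace_dim = len(S[0]) - rank(S)               # n - rank(S)
```

With more reactions than independent mass balances, `nullspace_dim` is positive, which is the situation described above for the E. coli network: many flux vectors satisfy Eq. 1.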

In addition to the mass balance constraints, we imposed 
constraints on the magnitude of each individual metabolic flux. 

$\alpha_i \le v_i \le \beta_i$ [2]

Fig. 1. The feasible solution set for a hypothetical metabolic reaction
network. (A) The steady-state operation of the metabolic network is restricted
to the region within a cone, defined as the feasible set (8). The feasible set
contains all flux vectors that satisfy the physicochemical constraints (Eqs. 1 and
2). Thus, the feasible set defines the capabilities of the metabolic network. All
feasible metabolic flux distributions lie within the feasible set. (B) In the
limiting case, where all constraints on the metabolic network are known, such
as the enzyme kinetics and gene regulation, the feasible set may be reduced
to a single point. This single point must lie within the feasible set.

The linear inequality constraints were used to enforce the
reversibility/irreversibility of metabolic reactions and the maximal
metabolic fluxes in the transport reactions. The intersection
of the nullspace and the region defined by the linear inequalities 
formally defined a region in flux space that we will refer to as the 
feasible set. The feasible set defined the capabilities of the 
metabolic network subject to the subset of cellular constraints, 
and all feasible metabolic flux distributions lie within the feasible 
set (see Fig. 1). However, not every vector v within the feasible set
is reachable by the cell under a given condition because of
other constraints not considered in the analysis (e.g., maximal
internal fluxes and gene regulation). The feasible set can be
further reduced by imposing additional constraints, and if all of 
the necessary details to describe metabolic dynamics are known, 
then the feasible set may reduce to a small region or even a single 
point (see Fig. 1). 

For the analysis presented herein, we defined $\alpha_i = 0$ for
irreversible internal fluxes, and $\alpha_i = -\infty$ for reversible internal
fluxes. The reversibility of the metabolic reactions was determined
from the biochemical literature and is identified for each
reaction on the web site. The transport flux for inorganic
phosphate, ammonia, carbon dioxide, sulfate, potassium, and
sodium was unrestrained ($\alpha_i = -\infty$ and $\beta_i = \infty$). The transport
flux for the other metabolites, when available in the in silico
medium, was constrained between zero and the maximal level
($0 \le v_i \le v_i^{\max}$). However, when the metabolite was not available
in the medium, the transport flux was constrained to zero. The 
transport flux for metabolites that were capable of leaving the 
metabolic network (i.e., acetate, ethanol, lactate, succinate, 
formate, pyruvate, etc.) always was unconstrained in the outward 
direction. 
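
The bounding rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the function names, the `kind` labels, the sign convention (secretion counted as positive flux), and the choice of an infinite upper bound for internal fluxes are all assumptions made for the example.

```python
import math

INF = math.inf

def flux_bounds(kind, reversible=False, in_medium=False, v_max=0.0):
    """Return (alpha, beta) for one flux, following the rules in the text.

    kind: "internal", "free_exchange" (e.g. phosphate, ammonia, CO2),
          "substrate" (uptake of a medium component), or "byproduct".
    """
    if kind == "internal":
        # Assumption: no upper limit is placed on internal fluxes.
        return (-INF, INF) if reversible else (0.0, INF)
    if kind == "free_exchange":
        return (-INF, INF)
    if kind == "substrate":
        return (0.0, v_max) if in_medium else (0.0, 0.0)
    if kind == "byproduct":
        return (0.0, INF)  # unconstrained in the outward direction
    raise ValueError(f"unknown flux kind: {kind}")

def in_feasible_set(S, v, bounds, tol=1e-9):
    """True if v satisfies both S . v = 0 (Eq. 1) and the bounds (Eq. 2)."""
    balanced = all(abs(sum(s * f for s, f in zip(row, v))) <= tol for row in S)
    bounded = all(lo - tol <= f <= hi + tol for f, (lo, hi) in zip(v, bounds))
    return balanced and bounded

# Hypothetical toy network: uptake -> A, A -> B, B -> secreted byproduct.
S = [[1, -1, 0], [0, 1, -1]]
bounds = [
    flux_bounds("substrate", in_medium=True, v_max=10.0),
    flux_bounds("internal"),
    flux_bounds("byproduct"),
]
```

A flux vector such as `[5, 5, 5]` lies in the feasible set of this toy network, whereas `[12, 12, 12]` violates the uptake limit and `[5, 4, 4]` violates the mass balance.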

A particular metabolic flux distribution within the feasible set 
was found by using linear programming (LP). A commercially
available LP package was used (LINDO, Lindo Systems, Chicago).
LP identified a solution that minimized a particular metabolic
objective (subject to the imposed constraints) (5, 25, 26), and was
formulated as shown. Minimize $-Z$, where

$Z = \sum_i c_i v_i = \langle c \cdot v \rangle$. [3]



The vector c was used to select a linear combination of metabolic 
fluxes to include in the objective function (27). Herein, c was 
defined as the unit vector in the direction of the growth flux, 
and the growth flux was defined in terms of the biosynthetic 
requirements: 

$\sum_m d_m \cdot X_m \rightarrow \text{Biomass}$, [4]

where $d_m$ is the biomass composition of metabolite $X_m$ (defined
from the literature; ref. 28), and the growth flux is modeled as 
a single reaction that converts all of the biosynthetic precursors 
into biomass. 
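
To illustrate how Eqs. 3 and 4 fit together, the sketch below appends a growth reaction to a small stoichiometric matrix and evaluates the objective. The two-metabolite network and the demand values in `composition` are made up for the example; the real biomass composition comes from ref. 28.

```python
def add_growth_reaction(S, metabolites, composition):
    """Append one column that drains each precursor X_m at its demand d_m (Eq. 4)."""
    column = [-composition.get(name, 0) for name in metabolites]
    return [row + [entry] for row, entry in zip(S, column)]

def objective(c, v):
    """Z = sum_i c_i * v_i (Eq. 3)."""
    return sum(ci * vi for ci, vi in zip(c, v))

# Hypothetical network: uptake -> A, A -> B; biomass needs 2 A and 1 B.
metabolites = ["A", "B"]
S = [[1, -1], [0, 1]]
composition = {"A": 2, "B": 1}   # illustrative d_m values only

S_with_growth = add_growth_reaction(S, metabolites, composition)

# c is the unit vector in the direction of the growth flux (last column).
c = [0, 0, 1]
v = [3, 1, 1]                    # a steady-state flux vector for this toy network
Z = objective(c, v)
```

Because c selects only the growth flux, Z equals the flux through the biomass reaction, so minimizing $-Z$ is the same as maximizing growth.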

Results 

FBA was used to examine the change in the metabolic capabilities
caused by gene deletions. To simulate a gene deletion, the
flux through the corresponding enzymatic reaction was restricted
to zero. Genes that code for isozymes or genes that code
for components of the same enzyme complex were simultaneously
removed (e.g., aceEF, sucCD). The optimal value of the objective
($Z_{\text{mutant}}$) was compared with the "wild-type" objective (Z) to
determine the systemic effect of the gene deletion. The ratio of
optimal growth yields ($Z_{\text{mutant}}/Z$) was calculated (Fig. 2).
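
The deletion procedure can be tried on a toy problem. The sketch below is not the LINDO-based calculation used in the paper: it assumes a hypothetical five-reaction network and replaces the LP solver with exhaustive vertex enumeration, which is only practical for very small networks and assumes that S has full row rank and that the optimum is bounded.

```python
from fractions import Fraction
from itertools import combinations, product
import math

INF = math.inf

def solve_square(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination; None if singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [x / pivot_value for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

def max_growth(S, bounds, growth_index):
    """Maximize one flux subject to S v = 0 and the bounds by checking every vertex."""
    m, n = len(S), len(S[0])
    best = None
    for fixed in combinations(range(n), n - m):
        free = [j for j in range(n) if j not in fixed]
        finite = [[b for b in bounds[j] if math.isfinite(b)] for j in fixed]
        for values in product(*finite):
            A = [[S[i][j] for j in free] for i in range(m)]
            rhs = [-sum(S[i][j] * val for j, val in zip(fixed, values)) for i in range(m)]
            x = solve_square(A, rhs)
            if x is None:
                continue
            v = [Fraction(0)] * n
            for j, val in zip(fixed, values):
                v[j] = Fraction(val)
            for j, val in zip(free, x):
                v[j] = val
            if all(bounds[j][0] <= v[j] <= bounds[j][1] for j in range(n)):
                if best is None or v[growth_index] > best:
                    best = v[growth_index]
    return best

def delete(bounds, *reactions):
    """Simulate a deletion by restricting the listed fluxes to zero."""
    return [(0, 0) if j in reactions else b for j, b in enumerate(bounds)]

# Columns: 0 uptake -> A, 1 A -> B, 2 2A -> B (bypass), 3 A -> out, 4 B -> biomass.
S = [[1, -1, -2, -1, 0],   # metabolite A
     [0, 1, 1, 0, -1]]     # metabolite B
BOUNDS = [(0, 10), (0, INF), (0, INF), (0, INF), (0, INF)]
GROWTH = 4

wild_type = max_growth(S, BOUNDS, GROWTH)
single = max_growth(S, delete(BOUNDS, 1), GROWTH)      # bypass still supports growth
double = max_growth(S, delete(BOUNDS, 1, 2), GROWTH)   # no route to B remains
```

In this toy case the single deletion halves the optimal yield (`single / wild_type` is 1/2) because the less efficient bypass takes over, while the double deletion abolishes growth, mirroring the distinction between nonessential and essential genes drawn in the text.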

Gene Deletions. E. coli MG1655 in silico was subjected to deletion
of each individual gene product in the central metabolic pathways
[glycolysis, pentose phosphate pathway (PPP), tricarboxylic
acid (TCA) cycle, respiration processes], and the maximal capability
of each in silico mutant metabolic network to support
growth was assessed with FBA. The simulations were performed
under an aerobic growth environment on minimal glucose
medium.

Fig. 2. Gene deletions in E. coli MG1655 central intermediary metabolism; maximal biomass yields on glucose for all possible single gene deletions in the central
metabolic pathways. The optimal value of the mutant objective function ($Z_{\text{mutant}}$) is compared with the "wild-type" objective function (Z), where Z is defined in
Eq. 3, as the ratio of optimal growth yields ($Z_{\text{mutant}}/Z$). The results were generated in a simulated aerobic environment with glucose as the carbon source. The
transport fluxes were constrained as follows: $\beta_{\text{glucose}}$ = 10 mmol/g-dry weight (DW) per h; $\beta_{\text{oxygen}}$ = 15 mmol/g-DW per h. The maximal yields were calculated
by using FBA with the objective of maximizing growth. The biomass yields are normalized with respect to the results for the full metabolic genotype. The yellow
bars represent gene deletions that reduced the maximal biomass yield to less than 95% of the in silico wild type.

The results identified the essential (required for growth)
central metabolic genes (Fig. 2). For growth on glucose, the 
essential gene products were involved in the three-carbon stage 
of glycolysis, three reactions of the TCA cycle, and several points 
within the PPP. The remainder of the central metabolic genes 
could be removed and E. coli in silico maintained the potential 
to support cellular growth. This result was related to the 
interconnectivity of the metabolic reactions. The in silico gene 
deletion results suggest that a large number of the central 
metabolic genes can be removed without eliminating the capa- 
bility of the metabolic network to support growth under the 
conditions considered. 

Are the in Silico Redundancy Results Consistent with Mutant Data? 

The in silico gene deletion study results were compared with 
growth data from known mutants. The growth characteristics of 
a series of E. coli mutants on several different carbon sources 
were examined and compared with the in silico deletion results 
(Table 2). From this analysis, 86% (68 of 79 cases) of the in silico 
predictions were consistent with the experimental observations. 
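
The 86% figure can be re-derived from the entries of Table 2. The sketch below transcribes each in vivo/in silico pair from the table and counts agreements. As an assumption that reproduces the reported count, a ± entry (growth through suppressor mutations) is scored as growth.

```python
# In vivo/in silico growth calls transcribed from Table 2, in row order.
TABLE_2 = {
    "aceA": "+/+ +/+ -/-", "aceB": "-/-", "aceEF": "-/+", "ackA": "+/+",
    "acn": "-/- -/-", "acs": "+/+", "cyd": "+/+", "cyo": "+/+",
    "eno": "-/+ -/+ -/- -/-", "fba": "-/+", "fbp": "+/+ -/- -/- -/-",
    "frd": "+/+ +/+ +/+", "gap": "-/- -/- -/- -/-", "glk": "+/+",
    "gltA": "-/- -/-", "gnd": "+/+", "idh": "-/- -/-", "mdh": "+/+ +/+ +/+",
    "ndh": "+/+ +/+", "nuo": "+/+ +/+", "pfk": "-/+", "pgi": "+/+ +/- +/-",
    "pgk": "-/- -/- -/- -/-", "pgl": "+/+", "pntAB": "+/+ +/+ +/+",
    "ppc": "±/+ -/+ +/+", "pta": "+/+", "pts": "+/+", "pyk": "+/+",
    "rpi": "-/- -/- -/- -/-", "sdhABCD": "+/+ -/- -/-", "sucAB": "+/+ -/+ -/+",
    "tktAB": "-/-", "tpi": "-/+ -/- -/- -/-", "unc": "+/+ ±/+ -/-",
    "zwf": "+/+ +/+ +/+",
}

def agrees(pair):
    """True if the in vivo and in silico growth calls match."""
    in_vivo, in_silico = pair.split("/")
    grows_in_vivo = in_vivo in ("+", "±")   # assumption: ± counts as growth
    return grows_in_vivo == (in_silico == "+")

pairs = [pair for row in TABLE_2.values() for pair in row.split()]
matches = sum(agrees(pair) for pair in pairs)
percent = round(100 * matches / len(pairs))

# Genes with at least one condition where prediction and experiment differ.
disagreeing = [gene for gene, row in TABLE_2.items()
               if not all(agrees(pair) for pair in row.split())]
```

Under this scoring, 68 of the 79 pairs agree, and the 11 disagreements fall in the aceEF, eno, fba, pfk, pgi, ppc, sucAB, and tpi rows, most of which are discussed in the table footnotes.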

How Are Cellular Fluxes Redistributed? The potential of many in 
silico deletion strains to support growth led to questions regard- 
ing how the E. coli metabolic genotype deals with the loss of 
metabolic functions. The answer involves the degree of stoichi- 
ometric connectivity of key metabolites. For illustration, the flux 
redistributions to optimally support growth of a single mutant 
and a double mutant were investigated. 

The optimal metabolic flux distribution for the in silico wild 
type was calculated (Fig. 3). The constraints used in the LP 
problem are defined in the figure legend. The in silico results 
suggest that optimally the oxidative branch of the PPP was used 
to generate a large fraction of the NADPH (66% in silico: 
20-50% reported in the literature, ref. 29), and the TCA cycle 
produced NADH. The optimal flux distribution also suggested 
that the majority of the high-energy phosphate bonds were 
generated via oxidative phosphorylation and acetate secretion 
because of limitations of the oxygen supply. 

The in silico gene deletion results predicted that the optimal
biomass yield of the zwf⁻ (glucose-6-phosphate dehydrogenase)
in silico strain was slightly less than that of the wild type. The optimal
flux distribution of the zwf⁻ in silico strain (Fig. 3) was calculated,
and the NADPH was optimally generated through the transhy- 
drogenase reaction and an elevated TCA cycle flux. The PPP 
biosynthetic precursors were generated in the nonoxidative 
branch. This metabolic flux rerouting resulted in an optimal 
biomass yield that was 99% of the in silico wild type. 

The transhydrogenase (pnt) also was deleted in silico, creating
an in silico double deletion mutant and eliminating an alternate
source of NADPH. The double mutant still maintained growth
potential. The optimal flux distribution (Fig. 3) used the isocitrate
dehydrogenase and the malic enzyme to produce NADPH.
The optimal biomass yield of the double mutant was 92% of the
in silico wild type. The FBA results were consistent with the
experimental observations that the zwf⁻ strain (30) and the pnt⁻
strain (29) are able to grow at near wild-type yields. Furthermore,
the zwf⁻ pnt⁻ double mutant strain also has been shown to
grow ($\mu_{\text{mutant}}/\mu_{\text{wild type}}$ = 57%) (29).

Discussion 

Extensive information about the molecular composition and 
function of several single-cellular organisms has become avail- 
able. A next important step will be to incorporate the available 
information to generate whole-cell models with interpretative 



Table 2. Comparison of the predicted mutant growth
characteristics from the gene deletion study to published
experimental results with single mutants

Gene glc gl succ ac Reference

aceA +/+ +/+ -/- (58)
aceB -/- (58)
aceEF* -/+ (60)
ackA +/+ (61)
acn -/- -/- (58)
acs +/+ (61)
cyd +/+ (62)
cyo +/+ (62)
eno¶ -/+ -/+ -/- -/- (30)
fba‖ -/+ (30)
fbp +/+ -/- -/- -/- (30)
frd +/+ +/+ +/+ (60)
gap -/- -/- -/- -/- (30)
glk +/+ (30)
gltA -/- -/- (58)
gnd +/+ (30)
idh -/- -/- (58)
mdh†† +/+ +/+ +/+ (63)
ndh +/+ +/+ (59)
nuo +/+ +/+ (59)
pfk† -/+ (30)
pgi‡ +/+ +/- +/- (30)
pgk -/- -/- -/- -/- (30)
pgl +/+ (30)
pntAB +/+ +/+ +/+ (29)
ppc§ ±/+ -/+ +/+ (63, 64)
pta +/+ (61)
pts +/+ (30)
pyk +/+ (30)
rpi -/- -/- -/- -/- (30)
sdhABCD +/+ -/- -/- (58)
sucAB +/+ -/+ -/+ (60)
tktAB -/- (30)
tpi** -/+ -/- -/- -/- (30)
unc +/+ ±/+ -/- (66-68)
zwf +/+ +/+ +/+ (30)

Results are scored as + or - meaning growth or no growth determined
from in vivo/in silico data. The ± indicates that suppressor mutations have
been observed that allow the mutant strain to grow. In 68 of 79 cases the in
silico behavior is the same as the experimentally observed behavior. glc,
glucose; ac, acetate; gl, glycerol; succ, succinate.

*The in vivo aceEF strain is able to grow under anaerobic growth conditions
by using the pyruvate formate lyase.

†The in silico pfk strain is able to grow by increasing the PPP flux ~5× and
using the pps gene product to overcome PEP deficiency.

‡The in silico pgi strain is unable to grow with glycerol or succinate as the
carbon source because it is unable to synthesize glycogen and one carbohydrate
component in the lipopolysaccharide. These are likely nonessential
components of the biomass.

§Growth on glycerol and glucose is possible through the utilization of the
glyoxylate bypass. Constitutive mutations in the glyoxylate bypass can suppress
the ppc phenotype.

¶The in silico eno strain is able to grow by the synthesis and degradation of

‖There is evidence that fba has an inhibitory effect on stable RNA synthesis (65).
Such an inhibition cannot be predicted by FBA.

**The inability of tpi mutants to grow on glucose may be related to the
accumulation of dihydroxyacetone phosphate, which leads to the formation
of the bactericidal compound methylglyoxal (30).

††Very slow growth on glycerol and succinate.

and predictive capability. Herein, we have taken a step in that
direction by using a set of constraints on cellular metabolism on
the whole-cell level to analyze the metabolic capabilities of the
extensively studied bacterium E. coli. We have calculated the
optimal metabolic network utilization with FBA. The in silico
results, based only on stoichiometric and capacity constraints,
were consistent with experimental data for the wild type and
many of the mutant strains examined.

Fig. 3. Rerouting of metabolic fluxes. (Black) Flux distribution for the
complete gene set. (Red) zwf⁻ mutant. Biomass yield is 99% of the results for
the full metabolic genotype. (Blue) zwf⁻ pnt⁻ mutant. Biomass yield is 92% of
the results for the full metabolic genotype (see text). The solid lines represent
enzymes that are being used, with the corresponding flux value noted. The
fluxes [substrates converted/h per g-dry weight (DW)] were calculated by
using FBA with the input parameters of glucose uptake rate ($\beta_{\text{glucose}}$ = 6.6
mmol glucose/h per g-DW) and oxygen uptake rate ($\beta_{\text{oxygen}}$ = 12.4 mmol
oxygen/h per g-DW) (41).

The construction of comprehensive in silico metabolic maps 
provided a framework to study the consequences of alterations 
in the genotype and to gain insight into the genotype-phenotype 
relation. The stoichiometric matrix and FBA were used to 
analyze the consequences of the loss of a gene product function 
on the metabolic capabilities of E. coli. The results demonstrated 
an important property of the E. coli metabolic network, namely 
that there are relatively few critical gene products in central 
metabolism. The nonessential genes in several organisms have 
been found experimentally on a genome scale (31, 32), which 
opens up the opportunity to critically test the in silico predictions. 
The in silico analysis also suggests that although the ability to
grow in one defined environment is only slightly altered, the
ability to adjust to different environments may be diminished
(33). Therefore, the in silico analysis provides a methodology for
relating the specific biochemical function of the metabolic 
enzymes to the integrated properties of the metabolic network. 

The in silico analysis presented herein is not the typical 
metabolic modeling; more appropriately, the analysis can be 
thought of as a constraining approach. This approach defines the 
"best" the cell can do and identifies what the cell cannot do, 
rather than attempting to predict how the cell actually will 
behave under a given set of conditions. To accomplish this, we 
have used a set of physicochemical constraints for which there is 
reliable information available, in particular the stoichiometric 
properties. FBA does not directly consider regulation or the 
regulatory constraints on the metabolic network. 

The results of FBA can be interpreted in a qualitative or a 
quantitative sense. At the first level we can ask whether a cell is 
able to grow under given circumstances and how a loss of the 
function of a gene product influences this ability. The results 
presented herein fall into this category. Quantitative predictions 
would hold true if the cell optimized its growth under the growth 
conditions considered. Therefore, when applying LP to predict 
quantitatively the optimal metabolic pathway utilization, it is 
assumed that the cell has found an "optimal solution" for 
survival through natural selection, and we have equated survival 
with growth. Although E. coli may grow optimally in defined 
media, one should not expect that optimizing growth is the 
governing objective of the cell under all growth conditions. For 
example, the regulatory mechanisms can only evolve to stoichi- 
ometric optimality in a condition to which the cell has been 
exposed. Furthermore, the growth behavior of mutant strains is 
unlikely to be optimal. However, FBA can still be used to 
delineate the metabolic capabilities of mutant cells based on 
constraining features, because both wild-type and mutant cells 
must obey the physicochemical constraints imposed. 

The constraints on the system accurately reflect the steady- 
state capabilities of the metabolic network, but does the calcu- 
lated optimal flux vector in the feasible set accurately reflect the 
behavior of the actual metabolic network? It has been shown that 
in minimal media the metabolic behavior of wild-type E. coli
is consistent with stoichiometric optimality (34). Furthermore, 
more detailed and critical experimental results are consistent 
with the hypothesis that E. coli does optimize its growth in 
acetate or succinate minimal media (33). Taken together these 
results call for critical experimental investigation to evaluate the 
hypothesis that stoichiometric and capacity constraints are the 
principal constraints that limit E. coli maximal growth. Even 
though growth and metabolic behavior in minimal media are 
consistent with FBA results, one still must determine the gen- 
erality of optimal performance. The call for critical experimen- 
tation is particularly timely, given the increasing number of 
genome scale measurements that are now possible through 
two-dimensional gels (35, 36) and DNA array technology (37, 
38). Furthermore, the ability to precisely remove ORFs can be 
used to design critical experiments (39). The in silico model can 
be used to choose the most informative knockouts and to design 
growth experiments with the knockouts. 

At the present time, the annotation of the E. coli genome is 
incomplete, and about one-third of its ORFs do not have a 
functional assignment. Thus, the metabolic genotype studied 
here may lack some metabolic capabilities that E. coli possesses. 
The biochemical literature also was used to define the in 
silico metabolic genotype, and given the long history of E. coli 
metabolic research (20), a large percentage of the E. coli 
metabolic capabilities likely have been identified. However, if 
additional metabolic capabilities are discovered (40), the E. coli 
stoichiometric matrix can be updated, leading to an iterative 
model building process. Additionally, the in silico analysis can 
help identify missing or incorrect functional assignments by 

Edwards and Palsson 



identifying sets of metabolic reactions that are not connected to 
the metabolic network by the mass balance constraints. 
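
One simple way to look for such disconnected reactions is to search the stoichiometric matrix for dead-end metabolites, that is, metabolites that can be produced but never consumed (or the reverse). The sketch below is a hypothetical illustration of that idea on a toy network; it is a conservative heuristic, not the procedure used by the authors.

```python
def dead_end_metabolites(S, metabolites, reversible):
    """Names of metabolites that can only be produced or only be consumed."""
    dead = []
    for name, row in zip(metabolites, S):
        can_produce = any(s > 0 or (s < 0 and rev) for s, rev in zip(row, reversible))
        can_consume = any(s < 0 or (s > 0 and rev) for s, rev in zip(row, reversible))
        if not (can_produce and can_consume):
            dead.append(name)
    return dead

def reactions_touching(S, metabolites, names):
    """Column indices of reactions involving any of the named metabolites."""
    rows = [row for name, row in zip(metabolites, S) if name in names]
    return sorted({j for row in rows for j, s in enumerate(row) if s != 0})

# Toy network: 0 uptake -> A, 1 A -> B, 2 B -> export, 3 A -> C (C is never used).
metabolites = ["A", "B", "C"]
S = [[1, -1, 0, -1],
     [0, 1, -1, 0],
     [0, 0, 0, 1]]
reversible = [False, False, False, False]

dead = dead_end_metabolites(S, metabolites, reversible)
suspect = reactions_touching(S, metabolites, dead)
```

Here metabolite C is a dead end, so reaction 3 cannot carry flux at steady state and would be flagged as a candidate for a missing or incorrect functional assignment.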

The ability to analyze, interpret, and ultimately predict cellular 
behavior has been a long sought-after goal. The genome se- 
quencing projects are defining the molecular components within 
the cell, and describing the integrated function of these molec- 
ular components will be a challenging task. The results presented 
herein suggest that it may be possible to analyze cellular me- 
tabolism based on a subset of the constraining features. Con- 
tinued prediction and experimental verification will be an 
integral part in the further development of in silico strains.
Deciphering the complex relation between the genotype and the 
phenotype will involve the biological sciences, computer 
science, and quantitative analysis, all of which must be included 
in the bioengineering of the 21st century. 
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of p2J-Luc, bax-luc, COPi-Luc, COMmut-Luc or NS-Luc, and 10 ng of pCMV-0-Gal, 
" or without increasing amounts (0.5, 1 and 2 ng) of pCMV-Flag-COPl or pCMV- 
The flood of high-throughput biological data has led to the
expectation that computational (or in silico) models can be
used to direct biological discovery, enabling biologists to reconcile
heterogeneous data types, find inconsistencies and systematically
generate hypotheses (1-3). Such a process is fundamentally
iterative, where each iteration involves making model predictions,
obtaining experimental data, reconciling the predicted
outcomes with experimental ones, and using discrepancies to
update the in silico model. Here we have reconstructed, on the
basis of information derived from literature and databases, the
first integrated genome-scale computational model of a transcriptional
regulatory and metabolic network. The model
accounts for 1,010 genes in Escherichia coli, including 104
regulatory genes whose products together with other stimuli
regulate the expression of 479 of the 906 genes in the reconstructed
metabolic network. This model is able not only to
predict the outcomes of high-throughput growth phenotyping
and gene expression experiments, but also to indicate knowledge
gaps and identify previously unknown components and interactions
in the regulatory and metabolic networks. We find that a
systems biology approach that combines genome-scale experimentation
and computation can systematically generate hypotheses
on the basis of disparate data sources.

We first validated the model, or 'in silico strain' of E. coli 
(iMC1010 vl ; see ref. 4 for conventions for naming in silico strains), 
against a data set of 13,750 growth phenotypes 5 obtained from the 
ASAP database 6 , and then used this genome-scale model to select 
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Figure 1 Growth phenotype study, a, Comparison of high-throughput phenotyping array 
data (Exp) with predictions for the £ coli network, both considering regulatory constraints 
(Reg) and ignoring such constraints as a control (Met). Each case is categorized by 
comparison type (Exp/Met/Reg), . and results are listed as '+' (predicted or observed 
growth), '-' (no growth) or 'n' (for cases involving a regulatory gene knockout not 
predictable by the Met model). The comparisons are further divided into four subgroups 
represented by different colours, b, Chart showing individual results for each knockout 
under each environmental condition, with results categorized and coloured as in a. The 
environments involve variation of a carbon or nitrogen source and are further divided into 
subgroups: AA, amino acid or derivative; CM, central metabolic intermediate; NU, 
nucleotide or nucleoside; SU, sugar; OT, other. The knockout strains are also divided by
functional group: A, amino acid biosynthesis and metabolism; B, biosynthesis of
cofactors, prosthetic groups and carriers; C, carbon compound catabolism; P, cell
processes (including adaptation and protection); S, cell structure; M, central intermediary
metabolism; E, energy metabolism; F, fatty acid and phospholipid metabolism;
N, nucleotide biosynthesis and metabolism; R, regulatory function; T, transport and 
binding proteins; U, unassigned. Each environment and knockout strain is associated with 
a fraction of agreement (FA) between regulatory model predictions and observed 
phenotypes, as shown in the bar charts to the right and below. c, Table showing all
environments or knockout strains for which FA < 0.60. Of these substrates or knockout
strains, 18 point to uncharacterized metabolic or regulatory capabilities in this organism,
as indicated (see Supplementary Information for a case-by-case analysis).

transcription factors for prospective gene knockout studies. Comparison
with the growth phenotypes showed that experimental and
computational outcomes agreed in 10,828 (78.7%) of the cases 
examined, which is roughly the same success rate achieved in 
previous studies in E. coli and yeast that considered only a few 
hundred phenotypes 7-9. In addition, 2,512 (18.3%) of the cases were
predicted correctly only when regulatory effects were incorporated 
into the metabolic model (see Supplementary Information for 
details). 
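The agreement figures quoted above can be checked with a short calculation. The following is a minimal sketch only: the function name and the five per-environment growth calls are hypothetical, and only the totals 10,828, 2,512 and 13,750 and the FA < 0.60 cutoff come from the text and Fig. 1.

```python
def fraction_of_agreement(observed, predicted):
    """Fraction of cases in which the predicted call ('+' or '-') matches the observed one."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


# Totals reported in the text.
total_cases = 13750
agree_with_regulation = 10828
correct_only_with_regulation = 2512

overall_agreement = agree_with_regulation / total_cases
regulation_dependent = correct_only_with_regulation / total_cases

# Hypothetical calls for one knockout strain across five environments.
observed_calls = ["+", "+", "-", "+", "-"]
regulatory_calls = ["+", "-", "-", "+", "+"]
fa = fraction_of_agreement(observed_calls, regulatory_calls)

# Fig. 1c lists environments or strains with FA below 0.60.
flagged = fa < 0.60
```

With the reported totals, `overall_agreement` rounds to 78.7% and `regulation_dependent` to 18.3%, matching the percentages in the paragraph.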

The comparisons in this study identified several substrates and 
knockout strains whose growth behaviour did not match predic- 
tions (Fig. 1). Further investigation of these conditions and strains 
led to the identification of five environmental conditions in which 
dominant, as yet uncharacterized, regulatory interactions actively 
contribute to the observed growth phenotype, and five environmental
conditions and eight knockout strains that highlight
uncharacterized enzymes or non-canonical pathways that are pre- 
dicted to be used by the organism (Fig. 1; a detailed analysis of the 
discrepancies is provided in the Supplementary Information). 

We wanted to determine the utility of this model-driven 
approach in elucidating transcriptional regulatory networks. A 
previous study, which evaluated the consistency between existing 
gene expression data sets and the known transcriptional regulatory 
network of E. coli, identified the response to oxygen deprivation as a 
partially consistent module 10,11 . We therefore targeted this part of 
the transcriptional regulatory network for further network characterization.
Six strains with knockouts of key transcriptional regulators
in the oxygen response (ΔarcA, ΔappY, Δfnr, ΔoxyR, ΔsoxS
and the double knockout ΔarcAΔfnr) were constructed. The
messenger RNA expression profiles of these strains, as well as the

Figure 2 Characterization of the regulatory network related to the aerobic-anaerobic
shift. a, The locus numbers, gene names and the log2 ratio (L2R) of gene expression
(aerobic to anaerobic) are shown for all model genes with either predicted or observed 
changes in expression (genes were divided into the same functional groups as in Fig. 1). 
The L2R values are shaded depending on the magnitude of the expression shift, and those 
enclosed by a box indicate a statistically significant change in expression (P < 0.007, 
FDR < 5%). Comparisons between the experimental data and model predictions are also 
shown, where v1 (iMC1010 v1) and v2 (iMC1010 v2) designate the model used in the
predictions. Filled and open symbols indicate model predictions and experimental data, 
respectively; rectangles indicate no change in gene expression; triangles indicate a 
change in expression, as well as the direction of change (upregulated or downregulated). 



b, Comparison of the predicted and observed expression changes for the v1 and v2 
models. A question mark indicates either that the given gene was not included in the 
model or that no expression data were obtained for a given shift; other symbols are the 
same as in a. c, Systematic perturbation analysis was used to determine the transcription 
factors responsible for the expression change. The transcription factors knocked out in the 
six strains are shown on top. Each row indicates a pattern of knockout strains in which 
differential expression was abolished. The number of genes that show this pattern is 
indicated on the right. Thus, the first row indicates that for 73 of the 437 genes that 
showed differential expression in the wild-type strain (or 20 of the 151 genes 
accounted for by the model), the observed differential expression was abolished only in 
the ΔarcAΔfnr knockout strain.



©2004 Nature Publishing Group 



letters to nature 



wild-type strain, were measured in aerobic and anaerobic glucose 
minimal medium conditions. The data were analysed 12 in the 
context of iMC1010 v1 predictions to identify new interactions in
the regulatory network (Fig. 2). 

Expression profiling of the wild-type strain identified 437 genes 
that experienced a significant change in transcription in response to 
oxygen deprivation (t-test, multiple testing corrected to give a false 
discovery rate (FDR) of less than 5%); of these, 151 genes were 
included in iMC1010 v1. Computationally, 75 genes were predicted
by iMC1010 v1 to show differential expression in response to oxygen
deprivation. These 75 genes could be divided into three categories: 
23 agreed with measured expression changes; 24 had a predicted 
expression change that was either not found to be statistically 
significant in the experimental data (23/24) or in a direction 
opposite to that of the experimental data (1/24); and for 28 genes 
there were no expression data available (transcript abundance was 
determined to be 'absent' for two or more of the replicates). Thus, of 
the 47 (= 23 + 24) differentially expressed genes that could be
compared between the model computation and experiment, 23 
(or 49% accuracy) agreed. Considering the overall number of genes 
in the model for which there were experimental data, the overlap 
(23) between the sets of predicted (47) and experimentally detected 
(151) differentially expressed genes is significant in comparison to a 
model that would randomly predict expression changes (P < 0.005 
on the basis of a cumulative binomial distribution). There were 
151 genes that were differentially expressed and included in the 
model; however, with only 23 (or 15% coverage) correctly com- 
puted, there is much room for expanding the transcriptional 
regulatory network in iMC1010 v1 on the basis of the experimental 
data (Fig. 3). 
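The accuracy and coverage figures in this paragraph follow from simple ratios. The sketch below, with hypothetical function names, assumes the definitions given in the Figure 3 caption (accuracy as the share of model predictions that agree with experiment, coverage as the share of experimental changes predicted correctly) and uses only counts stated in the text.

```python
def accuracy(correct, compared):
    """Share of comparable model predictions that agree with experiment."""
    return correct / compared


def coverage(correct, observed):
    """Share of experimentally observed changes that the model predicted correctly."""
    return correct / observed


# Breakdown of the 75 genes predicted by iMC1010 v1 to change expression.
agreed, disagreed, no_data = 23, 24, 28
predicted_total = agreed + disagreed + no_data  # should equal 75
comparable = agreed + disagreed                 # 47 genes with usable data

# 151 differentially expressed genes were included in the model.
v1_accuracy = accuracy(agreed, comparable)
v1_coverage = coverage(agreed, 151)

print(f"accuracy {v1_accuracy:.0%}, coverage {v1_coverage:.0%}")
```

Run as written, this reproduces the 49% accuracy and 15% coverage reported for iMC1010 v1.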

To understand which transcription factors are involved in regu- 
lating these differentially expressed genes after oxygen deprivation, 
we compared the gene expression data for the wild-type and each 
knockout strain separately. Using two-way analysis of variance




[Figure 3 panel text: iMC1010 v2. Phenotypic predictions: 79% (10,833/13,750) accuracy. Expression predictions: 98% (100/102) accuracy; 66% (100/151) coverage.]



Figure 3 Biological network elucidation by a model-centric approach. Metabolic and 
regulatory networks may be expanded by using high-throughput phenotyping and gene 
expression data coupled with the predictions of a computational model. If model 
predictions are consistent with experimental observations, the network is adequately 
characterized. If not, the model identifies a knowledge gap and may be used to update, 
validate and generate hypotheses about organism function. Accuracy refers to the 
percentage of model predictions that agree with experimental data; coverage indicates 
the percentage of experimental changes predicted correctly by the model. 



(ANOVA), we could determine whether the differential expression 
was significantly altered in the knockout strain as compared with 
the wild type. A large portion of the expression changes observed for 
the wild-type strain were not significantly affected in any of the 
knockout strains (195/437 or 44.6% of genes overall, 63/151 or 
41.7% of genes in the model, FDR < 5%), suggesting that none of 
the five transcription factors studied here regulates the expression of 
these genes or that combinatorial interactions between multiple 
transcription factors are involved in regulation. For the remainder 
of the genes, differential expression was abolished in one or more of 
the knockout strains (Fig. 2c). 

The ANOVA-based identification of transcription factors that 
influence differential expression of specific genes enabled us systematically
to rewrite, relax or remove various regulatory rules in the
model to resolve the discrepancies between iMC1010 v1 and the
experimentally determined wild-type differential gene expression.
For many (81) of the genes, a regulatory rule already existed and had
to be reconciled with our new data to accommodate the newly 
determined transcription factor dependencies. For genes where 
none of the knockouts abolished differential expression, we simply 
based a new regulatory rule on the presence of oxygen rather than a 
transcription factor (39 genes). By contrast, for genes where a 
change in expression was predicted but not observed, we removed 
oxygen dependency from the existing regulatory rule (23 genes). In 
addition, for 12 genes the predicted expression changes agreed with 
the observed expression in the wild type, but our knockout 
perturbation analysis indicated that the transcription factors 
involved in the regulation differed from previously reported data 
and the model needed to be changed (all new regulatory rules are 
detailed in the Supplementary Information). 
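The rule-revision cases described above can be summarized as a small decision procedure. This is a simplified sketch rather than the authors' actual method: the function name, the returned labels and the example genes are hypothetical, and the real revisions (detailed in the Supplementary Information) involved case-by-case judgment.

```python
def rule_update(predicted_change, observed_change, abolishing_knockouts):
    """Suggest how to revise one gene's regulatory rule (illustrative only).

    predicted_change / observed_change: whether the model / the wild-type data
    show differential expression after the oxygen shift.
    abolishing_knockouts: set of knockout strains in which the observed
    differential expression was abolished.
    """
    if predicted_change and not observed_change:
        # Change predicted but not observed: drop the oxygen dependency.
        return "remove oxygen dependency"
    if observed_change and not abolishing_knockouts:
        # No knockout abolished the change: tie the rule to oxygen itself.
        return "base rule on oxygen"
    if observed_change and abolishing_knockouts:
        # Reconcile the rule with the transcription factors indicated by ANOVA.
        return "reconcile rule with " + ", ".join(sorted(abolishing_knockouts))
    return "no change"


# Hypothetical genes, named only for illustration.
examples = {
    "geneA": rule_update(True, False, set()),
    "geneB": rule_update(False, True, set()),
    "geneC": rule_update(True, True, {"arcA", "fnr"}),
}
```

The third branch also covers genes whose predicted and observed changes already agree but whose knockout pattern points to different transcription factors than previously reported.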

The updated model (iMC1010 v2 ) was used to recalculate all of the 
predictions for both the aerobic and anaerobic expression data and 
the high-throughput phenotyping arrays. Note that iMC1010 v2
accounts for the same genes as iMC1010 v1 but has different
regulatory interactions among the gene products and oxygen as
an environmental variable. We found agreement between model
predictions and the gene expression data to be substantially higher
using the iMC1010 v2 model, as expected (Fig. 2c). Specifically, 100
of the 151 expression changes were correctly computed with 
iMC1010 v2 , and the number of false-positive predictions (Fig. 2, 
yellow boxes) was reduced to zero. In resolving many of the cases of 
unpredicted differential expression (Fig. 2, orange boxes), we found 
that implementation of the ANOVA-derived rule resulted in the 
inability of the wild-type or knockout in silico strain to grow 
aerobically or anaerobically on glucose, or under other conditions 
where growth had been previously established (for example, wild- 
type and knockout strain average growth rate under aerobic 
conditions, 0.68 ± 0.04 per hour; anaerobic, 0.43 ± 0.07 per
hour). Such cases may be thought of as an 'overfit' of the microarray
data. Accordingly, we relaxed the regulatory rule for these genes (42 
in total) to allow for a correct phenotype prediction. Comparisons 
for the high-throughput phenotyping data revealed very little 
difference from Fig. 1 (only 11 out of 13,750 cases were affected; 
see Supplementary Information). 

The iterative modification of the regulatory rules led to three 
main observations. First, some of the results of the knockout 
perturbation analysis are complex enough to make boolean rule 
formulation difficult. For example, the interplay of Fnr and ArcA 
can lead to complex behaviours where the expression change 
observed in wild type is abolished in the ΔarcA or the Δfnr strains,
but not in the ΔarcAΔfnr strain. Such complex interplay among
transcription factors can lead to specialized expression changes, as 
observed in the cydAB response to anaerobic, microaerobic and 
aerobic conditions 13, 14 . 

Second, in revising regulatory rules for transcription factors we 
found that whereas in some cases, such as arcA, expression of a 
regulatory protein correlates positively with its activity, in several
cases, including fnr, betI and fur among others, the mRNA level of a
regulatory gene is reduced when the protein is in fact activated. For 
example, under anaerobic conditions when Fnr is known to be 
active 11 , its expression is significantly reduced. Such behaviour, 
underscored by similar observations of mRNA transcript levels 
and corresponding protein product abundance in yeast 15 , suggests 
that the identification of regulatory networks, and transcription 
factors involved in regulation in particular, will not be accomplished 
by the determination of co-regulated gene sets alone. 

Third, many of these gene expression changes involve complex 
interactions and indirect effects. Transcription factors may be 
affected, for example, by the presence of fermentation by-products 
or the build-up of internal metabolites. Such effects would be
extremely difficult to identify or account for without a compu- 
tational model. 

In summary, we find that the reconciliation of high-throughput 
data sets with genome-scale computational model predictions 
enables systematic and effective identification of new components 
and interactions in microbial biological networks. Our study 
illustrates only the first round of an iterative model building strategy 
where an initial model based on literature-derived information 
(iMC1010 v1) is used to design informative experiments and then
updated on the basis of the new experimental data obtained 
(iMC1010 v2). Another round of perturbation experiments will
lead to iMC1010 v3, and so on. We expect that after an effort of
le years and many iterations of this process, regulatory network 
elucidation for E. coli will be essentially complete. □

Annotated genome sequences 1,2 can be used to reconstruct whole-cell
metabolic networks 3-6. These metabolic networks can be
modelled and analysed (computed) to study complex biological
functions 7-11. In particular, constraints-based in silico models 12
have been used to calculate optimal growth rates on common
carbon substrates, and the results were found to be consistent
with experimental data under many but not all conditions 13,14.
Optimal biological functions are acquired through an evolutionary
process. Thus, incorrect predictions of in silico models based
on optimal performance criteria may be due to incomplete
adaptive evolution under the conditions examined. Escherichia
coli K-12 MG1655 grows sub-optimally on glycerol as the sole
carbon source. Here we show that when placed under growth
selection pressure, the growth rate of E. coli on glycerol reproducibly
evolved over 40 days, or about 700 generations, from a
sub-optimal value to the optimal growth rate predicted from a
whole-cell in silico model. These results open the possibility of
using adaptive evolution of entire metabolic networks to realize
metabolic states that have been determined a priori based on in
silico analysis.

Predictive whole-cell metabolic models can be developed using a 
constraints-based modelling procedure 15-18. As an alternative to
detailed theory-based models, constraints-based models use the 
successive imposition of governing constraints (such as mass con- 
servation, thermodynamics, capacity and nutritional environment) 
to eliminate network functions that exceed the governing con- 
straints. Mathematically this procedure defines a solution space 
containing all possible metabolic network functions that satisfy the 
governing constraints. Each particular solution in this space corre- 
sponds to a particular state of the metabolic network and therefore a 
potential behaviour of the cell. Within the solution space defined by 
the governing constraints, the optimal use of the metabolic network 
to support growth can be found among all possible solutions using 
linear optimization 1 ^ 19 . However, a single optimal growth con- 
dition is of limited interest and a phenotype phase plane (PPP) 
analysis has been developed to obtain a broad understanding of a 
metabolic network's optimal properties 20,21 . The PPP analysis evalu- 
ates the optimal properties of a metabolic network under a range of 
environmental conditions (see Methods) and has been used to show 
that the growth of E. coli is consistent with the optimal use of its 
metabolic network under several defined growth conditions 12-14.
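As a rough illustration of the constraints-based procedure described above, the sketch below maximizes a growth flux subject to mass balance (S v = 0) and capacity bounds, then repeats the calculation for several oxygen uptake limits in the spirit of a phenotype phase plane scan. Everything specific here is an assumption made for illustration: the five-reaction network, its stoichiometric coefficients and the uptake limits are invented and are not taken from the E. coli model, and the small vertex-enumeration solver stands in for a real linear programming library (it is only practical for toy problems with finite bounds).

```python
from fractions import Fraction
from itertools import combinations


def solve_square(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination; None if singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [x / pivot_value for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]


def flux_balance(S, lower, upper, objective):
    """Maximize objective . v subject to S v = 0 and lower <= v <= upper.

    Enumerates candidate vertices: the mass balances plus enough active
    bounds to pin down every flux. Assumes the rows of S are independent.
    """
    n, m = len(lower), len(S)
    bounds = [(j, lower[j]) for j in range(n)] + [(j, upper[j]) for j in range(n)]
    best = None
    for active in combinations(bounds, n - m):
        A = [list(row) for row in S]
        b = [0] * m
        for j, value in active:
            A.append([1 if k == j else 0 for k in range(n)])
            b.append(value)
        v = solve_square(A, b)
        if v is None or any(not (lower[j] <= v[j] <= upper[j]) for j in range(n)):
            continue
        score = sum(c * x for c, x in zip(objective, v))
        if best is None or score > best[0]:
            best = (score, v)
    return best


# Invented toy network. Columns: substrate uptake, oxygen uptake,
# respiration, fermentation (overflow), growth.
S = [
    [1, 0, -1, -1, -1],  # carbon intermediate
    [0, 1, -1, 0, 0],    # oxygen
    [0, 0, 4, 1, -3],    # energy carrier
]
objective = [0, 0, 0, 0, 1]  # maximize the growth flux


def max_growth(substrate_limit, oxygen_limit):
    """Optimal growth flux for given uptake limits (arbitrary units)."""
    lower = [0, 0, 0, 0, 0]
    upper = [substrate_limit, oxygen_limit, 1000, 1000, 1000]
    return flux_balance(S, lower, upper, objective)[0]


# Scan oxygen limits at a fixed substrate limit.
scan = {oxygen: max_growth(10, oxygen) for oxygen in (0, 2, 6)}
```

In this toy network the optimal growth flux rises with the oxygen limit until substrate uptake becomes limiting, and the fermentation flux is used only when oxygen is restricted, which loosely mirrors the overflow behaviour discussed for phase 2 of the PPP.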

It is not known whether optimal growth is observed on all 
substrates, and if not, whether adaptive evolution towards optimal 
growth can be achieved. Furthermore, if such adaptive evolution 
towards the optimal behaviour occurs, does the endpoint corre- 
spond with a priori calculations? To address these issues, we 
examined prolonged exponential growth of E. coli K-12 on several 
substrates (acetate, succinate, malate, glucose and glycerol). All 
calculations presented here were made with a previously formulated 
large-scale E. coli metabolic model 12,14 , and the model was not 
adjusted or 'fitted' to the data described. 

Batch growth experiments were done using malate as the sole
carbon source with a range of substrate concentrations
(0.25-3 g l⁻¹) and temperatures (29-37 °C) to vary the malate uptake
rate (MUR). The MUR, oxygen uptake rate (OUR) and growth rate 
were measured. The measured MUR and OUR data were optimal, as 
defined by the line of optimality (LO) in the PPP (Fig. 1a). The
optimal growth rate of E. coli was calculated for all combinations of
the MUR and OUR and displayed as a surface over the PPP (Fig. 1b).
The experimentally determined growth rates were on the edge of the
colour-coded solution space that corresponds to the LO (Fig. 1b).
Hence the optimal growth performance of E. coli K-12 on malate 
was predicted a priori by using PPP. The results for growth with 
malate as the sole carbon source were in agreement with previous 
observations of E. coli metabolism for growth on succinate or 

A natural question arises: is the optimal performance on malate 
stable over prolonged periods of time? To address this question, 
adaptive evolution of E. coli on malate was studied for 500 
generations. The adaptation resulted in a 19% increase in growth 
rate. However, the MUR and OUR also increased and maintained 
metabolic operation on the LO (Fig. 1). Similar adaptive evolution 
experiments on acetate and succinate resulted in an increased 
growth rate (20% and 17%, respectively) (Fig. 2). Both the oxygen 
and substrate uptake rates increased concomitantly to maintain 
optimal growth as defined and predicted by the PPP analysis. 

The growth rate of E. coli using glucose as the carbon source was 
also increased by prolonged exponential growth (Figs 2 and 3). 
Before adaptive evolution on glucose the cellular growth rate, OUR, 
and glucose uptake rate (GUR) were experimentally determined 
over a range of glucose concentrations and temperatures. The 
experimentally determined values for the GUR and OUR corre- 
sponded to points on the LO or slightly in phase 2 (the acetate 
overflow region) of the PPP (ref. 21) (Fig. 3a). The predicted acetate
secretion in phase 2 was experimentally observed and the measured 
growth rates were on the surface of the solution space near the edge 
corresponding to the LO (data not shown). E. coli was subsequently 
kept in sustained exponential growth over 500 generations (Fig. 3b,
c). The growth rate increased by 17%, as shown by movement of the
experimental data points within phase 2 on the surface towards 
higher growth rates. Thus, as with malate, succinate and acetate, the 
growth rate of E. coli with glucose as the carbon source could be 
slightly increased with the substrate and oxygen uptake rates 
moving in phase 2 with some acetate overflow. It was also noted 
that evolutionary adaptation maintained metabolic operation on 
the surface of the three-dimensional PPP, as predicted by the 
physicochemical constraints on the metabolic network. The metabolic
operation in phase 2 provided an increased growth rate
with a reduced yield (relative to the LO).

We determined the growth performance over a range of glycerol 
concentrations and temperature. Unlike growth on malate or 
glucose, the experimental data points were scattered throughout 
phase 1 far from the LO (Fig. 4b), indicating sub-optimal growth of

Figure 1 Growth of E. coli K-12 on malate. a, The malate-oxygen phenotype phase plane
(PPP). Phase 1 is characterized by metabolic futile cycles, whereas phase 2 is
characterized by acetate overflow metabolism. The line of optimality (LO, in red) separates
phases 1 and 2 (ref. 21). Data points (open circles) represent malate concentrations
ranging from 0.25-3 g l⁻¹ and temperatures ranging from 29-37 °C. The two data
points in blue represent the starting point (day 0) and endpoint (day 30) of adaptive
evolution respectively, at a malate concentration of 2 g l⁻¹ and a temperature of 37 °C.
These data points represent a span of 500 generations. b, Three-dimensional
representation of growth rates. The x and y axes represent the same variables as in a. The
z axis represents the cellular growth rate (h⁻¹). OUR, oxygen uptake rate; MUR, malate
uptake rate.

Figure 2 Growth rate during adaptive evolution on glucose, malate, succinate and
acetate. Growth conditions were kept constant at a temperature of 37 °C and a substrate
concentration of 2 g l⁻¹. We measured growth rate in the exponential phase of growth.
The increases in growth rate over time were as follows: glucose (18%), malate (21%),
succinate (17%) and acetate (20%). The number of generations for each adaptive
evolution was: glucose (500), malate (500), succinate (1,000) and acetate (700).

wild-type E. coli K-12 on glycerol, consistent with previous obser-

We studied E. coli adaptive growth using glycerol as the sole
carbon source (2 g l⁻¹), by serial transfer, at a temperature of 30 °C,
and with sufficient oxygenation. Growth rate, glycerol uptake rate
(Gl-UR) and OUR were measured every ten days. Over a 40-day
period an evolutionary path (E1) was observed (Fig. 4c). Phenotypic
changes were traced in phase 1, eventually converging towards
the LO. During this 40-day period, the growth rate more than
doubled from 0.23 h⁻¹ to 0.55 h⁻¹ (Fig. 4a). The substrate uptake
and growth rate data obtained under various growth conditions 
after adaptive evolution were near the LO (Fig. 4d). The evolved 
strain attained near-optimal growth on glycerol as defined by the 
in silico predictions. A second, independent adaptation experiment 
gave a similar but non-identical evolutionary trajectory (E2), 
converging near the same endpoint (Fig. 4c). Finally, a third 
independent adaptation experiment (E3) was done with a different 
initial starting point within phase 1. E3 was done at 37 °C and a 
glycerol concentration of 2 g l⁻¹. The adaptation of E. coli to growth




Figure 3 Growth of E. coli K-12 on glucose. a, The glucose-oxygen PPP. Like the malate-oxygen
PPP, phase 1 represents sub-optimal growth and phase 2 is characterized by
acetate overflow metabolism. The LO is shown in red. b, GUR plotted against OUR along
with experimental values for adaptive evolution experiments. Open circles represent
measurements during the adaptive evolutionary process, whereas blue circles indicate
the beginning and end of evolution. c, Three-dimensional rendering of computed growth
rate and the experimental data (from a and b). The x and y axes represent the GUR and
OUR. The z axis represents the cellular growth rate (h⁻¹).



on glycerol at 37 °C resulted in motion towards the LO and the 
growth rate increased by about 30% (Fig. 4a). The final growth rate 
of the E3 strain was consistent with the in silico predictions with 
respect to the Gl-UR, OUR and the growth rate. 
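The growth rates reported for the glycerol experiments can be turned into doubling times using the relation t_d = ln(2)/μ given under Analytical procedures. The sketch below is illustrative: the function names are hypothetical, and the generation counts assume a constant growth rate over the whole period, which the evolving cultures did not have, so they serve only as rough bounds.

```python
import math


def doubling_time(growth_rate_per_h):
    """Doubling time in hours from a specific growth rate (per hour)."""
    return math.log(2) / growth_rate_per_h


def generations(growth_rate_per_h, hours):
    """Number of doublings at a constant growth rate over the given time."""
    return growth_rate_per_h * hours / math.log(2)


# Growth rates at the start and end of the 40-day E1 experiment (from the text).
mu_start, mu_end = 0.23, 0.55

fold_increase = mu_end / mu_start
td_start = doubling_time(mu_start)
td_end = doubling_time(mu_end)

# Bounds on the number of generations in 40 days if growth had stayed
# at the starting rate or at the final rate throughout.
hours = 40 * 24
gen_low = generations(mu_start, hours)
gen_high = generations(mu_end, hours)
```

Under these assumptions the doubling time falls from roughly 3.0 h to roughly 1.3 h, and the reported figure of about 700 generations in 40 days lies between the two bounds.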

To assess the stability of the endpoint of the adaptive evolution, 
we extended the cultivation on glycerol for an additional 300 
generations, or 20 days for the E1 and E2 strains. The data indicated
no further change in growth (Fig. 4a). On the sixtieth day of the
experiment the E1 and E2 strains exhibited growth on the LO under
various growth conditions, reaffirming optimal growth behaviour 
and the stability of the phenotype (Fig. 4e).

Figure 4 Growth of E. coli K-12 on glycerol. a, Change in growth rate with time for three
adaptive evolution experiments: trajectories E1, E2 and E3. E1 and E2 were performed at
30 °C and E3 at 37 °C. The glycerol concentration was kept constant at 2 g l⁻¹ during E1,
E2 and E3. b, The PPP pre-evolution. The LO is shown in red. The range of glycerol
concentrations used was 0.25-2 g l⁻¹. c, The PPP during adaptive evolution.
Experimental values for E1 are indicated in blue, for E2 in green
and for E3 in red. The starting point of evolution for E1 and E2 is indicated in black (day 0).
d, The PPP after 40 days (about 700 generations) of evolution. The range of glycerol
concentrations used was 0.25-2 g l⁻¹. e, The PPP after 60 days (1,000 generations) of
evolution. The range of glycerol concentrations used was 0.25-2 g l⁻¹. Data points were
obtained using the E1 (blue) and E2 (green) strains. Gl-UR, glycerol uptake rate.

Selection pressure is expected to result in optimal performance
through an evolutionary process. Optimal growth of E. coli on 
acetate, succinate, malate and glucose is consistent with the predic- 
tions of whole-cell in silico models. The strain used here has 
presumably never had to compete for survival using glycerol as 
the sole carbon source and thus initially utilized this carbon source 
non-optimally. However, adaptive evolution on glycerol resulted in 
the a priori calculated optimal growth that was based on the 
constraints placed on the E. coli metabolic network. The adaptive 
evolutionary process had a reproducible and predictable endpoint. 

This study opens up several possibilities. First, it may now be 
possible to specify optimal network properties in silico and achieve 
them through an adaptive evolutionary process or in combination 
with a series of other methodologies. In silico design of micro- 
organisms could be used to improve their metabolic abilities, 
production efficiency and/or operational longevity. Second, 
changes in mRNA expression levels and DNA sequences can now 
be monitored as cells progress along a defined evolutionary path. 
Such experiments may yield valuable insight into the molecular 
design of complex control circuits and their adaptation during 
evolution. The combination of in silico and experimental biology 
introduced here may make a new series of biological designs 
attainable. 

Constraint-based computational models use an optimization- 
based procedure to predict cellular states. It is assumed that this 
optimal state, within the governing constraints, is found by altering 
the numerical values of the kinetic and regulatory constants 
through a 'trial-and-error' process. This feature of constraint- 
based models is a significant departure from other types of math- 
ematical models of cell function, where these parameters are treated 
as being time-invariant. Thus constraint-based models directly 
account for the fundamental nature of adaptive evolution. The 
adaptive evolutionary path itself cannot be predicted; however, the 
final



Adaptive evolution 



Strains and media 

The E. coli K-12 MG1655 annotated genome sequence and the biochemical literature were 
used to construct the in silico E. coli strain 1,3,12 . We simulated the metabolic capabilities as 
previously described with the objective of maximizing growth 12,14,11 . Both growth and

Analytical procedures

Cellular growth rate was monitored by measuring the absorbance (A, or optical density) at
600 nm and 420 nm and by cell counts (Coulter Electronics). The doubling time was
calculated from the growth rate: t_d = ln(2)/μ. Absorbance to cellular dry weight

dried at 75 °C to a constant weight; and (2) 25-50 ml samples (taken throughout the 

chromatography (HPLC) (Rainin Instruments). An Aminex HPX-87H ion exchange
carbohydrate-organic acid column (Bio-Rad Laboratories) (65 °C) was used with
:d 5 mM sulphuric acid as the mobile phase and ultraviolet detection

incubator. Serial transfers were made during the exponential phase of growth at mid-log
phase (A600 nm = 0.55) using an adjusted inoculum volume based on the growth rate of
the culture. On a daily basis, the growth rate, time of inoculation, A600 nm of the culture,

cultures were stored on a daily basis. The culture was tested weekly for pH and phenotyped 

contamination. No discernible differences in colony morphology or si 
contamination were observed during these experiments. 

Phenotype phase plane analysis 

First, the metabolic reconstruction was done using biochemistry, gene 
physiological data". Second, mass balance, capacity and thermodynam

3. I am very familiar with stoichiometric models of metabolism and have read U.S.
application serial no. 09/923,870, by Palsson. I also am very familiar with Dr. Palsson's work, 
including the publication that is the basis of this application (Edwards and Palsson, Proc. Natl. 
Acad. Sci. U.S.A., 97:5528-33 (2000)). I understand that the invention described in this 
application is directed to constructing genome specific stoichiometric matrices that can be 
utilized with flux balance analysis for modeling metabolism. The application claims, in part, a 
method of simulating a metabolic capability by incorporating metabolic reactions through the use 
of genome information. I also understand that the claimed invention stands rejected for 
obviousness over the combination of references to Pramanik and Keasling, Biotech. and
Bioengineering 56:398-421 (1997) in view of Blattner et al., Science 277:1453-69 (1997) and in
view of Kunst et al., Rev. in Microbiol. 142:905-12 (1991). 

4. I know from personal knowledge that at the time Professor Palsson's invention 
was made respected and prestigious scientists in the field of metabolic engineering publicly
criticized Professor Palsson's invention and had a strong disbelief that it worked. 

5. In July of 1999 I chaired a meeting session at the Biochemical Engineering XI
meeting held in Salt Lake City, Utah. I also was a member of this meeting's Advisory 
Committee. Attached as Exhibit 2 is a copy of the final program for that meeting entitled 
Biochemical Engineering XI: Molecular Diversity in Discovery and Bioprocessing. 

6. During the meeting session that I chaired, Professor Palsson was giving a lecture 
on the use of metabolic models for simulations of growth and analysis of deletion mutants and 
other characteristics using the concepts of flux balance analysis. After Professor Palsson's 
lecture, Professor Jay Bailey from the auditorium requested whether he could provide some 
comments to the lecture. I agreed as I assumed that it was the normal type of questions and 
comments that are asked of lecturers at conferences, and considering Professor Bailey's
prominent position in the field of biochemical engineering and metabolic engineering I expected 
a good discussion. In fact Professor Bailey has published himself in the field of metabolic flux 
analysis, and had been working quite a lot on metabolic models. 

7. As it turned out, Professor Bailey had prepared 4-5 overhead slides analyzing and 
criticizing the approach taken by Professor Palsson. It was clear to me that there must have been 
some earlier correspondence, and it also seemed that Professor Bailey had prepared "opposition" 
to a different lecture from the one Professor Palsson had just given, as several of the critical points 
raised by Professor Bailey were in fact addressed in Professor Palsson's lecture. 

8. The main criticism raised by Professor Bailey was that it was not possible to 
predict metabolic functions and cellular physiology by simply using constraint-based 
simulations. He argued that, due to the large degrees of freedom, the predictions are not likely to 
represent true phenotypes. It is correct that there are large degrees of freedom in the system and 
that constraint-based simulations cannot correctly predict all the fluxes in the network, but 
numerous later examples have demonstrated that correct phenotypes are captured very well 
by this kind of simulation. The "opposition" from Professor Bailey was supported by several 
others in the auditorium, so it was quite clear that there was a general belief that the concept 
would not work. 

statements are made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code and that any such willful false statement may jeopardize the validity of the application or 
any patent issued thereon. 

Douglas C. Cameron 
Cargill 
Gopal K. Chotani 
Genencor International 
Stephen W. Drew 
Merck & Co., Inc. 
Wei Shou Hu 
University of Minnesota 
Robert Kelly 
North Carolina State University 
Allen Laskin 
Laskin/Lawrence Associates 
Sang Yup Lee 
KAIST 
Vasantha Nagarajan 
DuPont 
David Naveh 
Bayer Biotechnology 
Jens Nielsen 
Technical University of Denmark 
E. Terry Papoutsakis 
Northwestern University 
Greg Stephanopoulos 
Massachusetts Institute of Technology 
James R. Swartz 
Stanford University 
K. Dane Wittrup 
University of Illinois 
Toshiomi Yoshida 
Osaka University 
Dane W. Zabriskie 
SmithKline Beecham Laboratories 

Financial Sponsors of Biochemical Engineering XI 
Abbott Laboratories 
Amgen, Inc. 
Bayer Corporation 
Biogen 
Boehringer Ingelheim 
Bristol-Myers Squibb 
Covance Bio 
E.I. Du Pont de Nemours and Company 
Genencor International 
Genentech, Inc. 
Gist-Brocades 
Lilly Research Laboratories (Eli Lilly & Co.) 
Merck & Co., Inc. 
Novo Nordisk A/S 
National Science Foundation, U.S.A. 
Pfizer, Inc. 
The Procter and Gamble Company 
Schering-Plough Research Institute 
Serono Pharmaceutical Research Institute 
U.S. Department of Army (Army Research Office) 
The Whitaker Foundation 

Poster Session A 
Poster Session B 

Sunday, July 25, 1999 

12:00 noon - 2:30 p.m. Conference Registration (Ballroom Atrium) 

2:30 p.m. - 3:00 p.m. Welcoming Remarks (Ballroom I) 
George Georgiou, University of Texas at Austin 
Steven Lee, Merck & Co., Inc. 
Allen Laskin, United Engineering Foundation 

PLENARY SPEAKERS: 
3:00 p.m. - 4:00 p.m. Whole-Genome RNA and Protein Regulatory Networks in E. Coli 
and S. Cerevisiae 
G. Church 
Harvard University 

4:00 p.m. - 5:00 p.m. The Chemical Industry in the 21st Century: Role of 
Biotechnology 
J. Miller 
CTO, DuPont 

5:00 p.m. - 5:30 p.m. Break (Ballroom Atrium) 

5:30 p.m. - 6:30 p.m. Using New Gene Targets in Drug Discovery with Effective 
Throughput Screening Systems 
P. Fernandes 

CEO, Small Molecule Therapeutics 

6:30 p.m. - 7:30 p.m. Methods for Mining DNA Microarray Data 

G. Stephanopoulos 

Department of Chemical Engineering, Massachusetts Institute of Technology, USA 

7:30 p.m. - 9:00 p.m. Dinner (Ballroom II & III) 

9:00 p.m. - 10:30 p.m. Opening Reception (Ballroom Atrium) 

Monday, July 26, 1999 

7:00 a.m. - 8:15 a.m. Breakfast (Ballroom II & III) 

SESSION (a): EXPLOITING MICROBIAL DIVERSITY (Ballroom I) 
Session Chairs: R. Kelly, North Carolina State University 
V. Nagarajan, DuPont 

8:15 a.m. - 8:20 a.m. Introduction: Session Chairs 

8:20 a.m. - 9:00 a.m. Gene Acquisition in Bacterial Genome Evolution 
J. Roth 

University of Utah 

9:00 a.m. - 9:30 a.m. How Bacteria Talk to Each Other: Quorum Sensing in 
Escherichia Coli, Salmonella Typhimurium and Vibrio Harveyi 
L. Bassler 

Princeton University 

9:30 a.m. - 10:00 a.m. Predicting the Consequences of Genome Reorganization 
I. Molineux 

University of Texas at Austin 

10:00 a.m. - 10:30 a.m. Coffee Break (Ballroom Atrium) 

10:30 a.m. - 11:00 a.m. Hyperthermophilic Genome Sequences: So Many Interesting 
Enzymes, So Little Time 
R. Kelly 

North Carolina State University 

11:00 a.m. - 11:30 a.m. Purification, Characterization and Process Considerations of 
Cryophilic Proteases of Marine Origin 
J. Asenjo 

University of Chile 

11:30 a.m. - 12:00 noon Harvesting Biomolecules for the Tree of Life 
E. Mathur 

Diversa Corporation 

12:00 noon - 12:30 p.m. Genetic and Biochemical Diversity of Bacteria in Commercial 
Wastewater Bioreactors 
V. Nagarajan 
DuPont Company 

12:30 p.m. - 1:45 p.m. Lunch (Ballroom II & III) 

1:45 p.m. - 4:00 p.m. Ad hoc Sessions and/or free time 

SESSION (b): NOVEL HIGH THROUGHPUT TECHNOLOGIES IN DISCOVERY (Ballroom I) 
Session Chairs: G. Georgiou, University of Texas at Austin 
J. Chalmers, Ohio State University 

4:00 p.m. - 4:05 p.m. Introduction: Session Chairs 

4:05 p.m. - 4:35 p.m. Non-Invasive Monitoring of Pathogens, Tumor Cells and Gene 
Expression in a Living Animal 
P. Contag 

CEO, Xenogen Corp. 

4:35 p.m. - 5:05 p.m. Directed Enzyme Evolution on Bacterial Surfaces 
B.L. Iverson 

University of Texas at Austin 

5:05 p.m. - 5:35 p.m. Lead Identification from Encoded Synthetic Combinatorial 
Libraries Via a Self-Selective, Positive Feedback Reporter System 
G. Lalonde 

Affymax Research Institute 

5:35 p.m. - 6:00 p.m. Coffee Break (Ballroom Atrium) 

6:00 p.m. - 6:30 p.m. Transcript Profile and Proteomics-Based Views of Biological 
Models for Pharmaceutical Development 
J. Seilhamer 
Incyte Pharmaceuticals, Inc. 

6:30 p.m. - 7:00 p.m. Microfabricated Chemical Analyses Systems 
Mark Burns 
University of Michigan 

7:00 p.m. - 7:30 p.m. Exploitation of Immunomagnetic Labeling: Rapid Cell Sorting 
and Fractionation 
J. Chalmers 

Ohio State University, USA 

7:30 p.m. - 9:00 p.m. Dinner (Ballroom II & III) 

9:15 p.m. - 10:30 p.m. Poster Session A and Social Hour (Ballroom Atrium) 

Tuesday, July 27, 1999 

7:00 a.m. - 8:25 a.m. Breakfast (Ballroom II & III) 

SESSION (c): DISCOVERY AND DESIGN OF MACROMOLECULES (Ballroom I) 
Session Chairs: B.L. Iverson, University of Texas at Austin 
V. Schellenberger, Genencor International, USA 

8:25 a.m. - 8:30 a.m. Introduction: Session Chairs 

8:30 a.m. - 9:00 a.m. Evolution of Ribozymes, Proteins and Pathways 
A. Ellington 

University of Texas at Austin 

9:00 a.m. - 9:30 a.m. Directed Evolution of Subtilisin 

V. Schellenberger 
Genencor International, USA 

9:30 a.m. - 10:00 a.m. Directed Evolution of Protein Recognition, Stability and 
Expression by Yeast Surface Display 
K.D. Wittrup 
University of Illinois 

10:00 a.m. - 10:30 a.m. Coffee Break (Ballroom Atrium) 

10:30 a.m. - 11:00 a.m. Creation of New Cytotoxic Antitumor Antibodies by 
Glycosylation Engineering 
J.E. Bailey 

Institute of Biotechnology, ETH Zurich, Switzerland 

11:00 a.m. - 11:30 a.m. Molecular Breeding of Genes, Pathways and Genomes by 
DNA Shuffling 
W.P.C. Stemmer 
Maxygen, Inc., USA 

11:30 a.m. - 12:00 noon Discussion 

12:00 noon - 1:30 p.m. Lunch (Ballroom II & III) 

1:45 p.m. - 3:00 p.m. Ad hoc Sessions and/or free time 

SESSION (d): BIOINFORMATICS (Ballroom I) 

Session Chairs: J. Nielsen, Technical University of Denmark, Denmark 
B.O. Palsson, University of California at San Diego 

3:15 p.m. - 3:45 p.m. Definition of the E. Coli Metabolic Genotype: Basic Concepts, 
Scientific and Applied Uses 
B.O. Palsson 

University of California at San Diego 

3:45 p.m. - 4:15 p.m. Strategies for Prediction of Orphan Protein Function 
S. Brunak 

Technical University of Denmark 

4:15 p.m. - 4:45 p.m. Pathmap: A New Tool for Visualization of Gene Expression Data 
John Rogers 

Parke-Davis Pharmaceutical Research 

4:45 p.m. - 5:15 p.m. Bioinformatics in the Postgenomic Era 
S. Subramanian 
University of Illinois UC 

5:15 p.m. - 5:45 p.m. Coffee Break (Ballroom Atrium) 

SESSION (e): DISCOVERY AND PRODUCTION OF BIOACTIVE SMALL MOLECULES 
Session Chairs: A.E. Barron, Northwestern University 
D.S. Clark, University of California - Berkeley 

5:45 p.m. - 6:15 p.m. Integrating Biocatalysis into the Drug Discovery Pipeline 
D.S. Clark 

University of California - Berkeley 

6:15 p.m. - 6:45 p.m. A Stress Promoter-Based System for Antibiotics Screening 
F. Baneyx 

Department of Chemical Engineering, University of Washington, USA 

6:45 p.m. - 7:15 p.m. The Challenge of Ultra-High Throughput Screen Technology: 
Implementation of Biological Assays to 3,456 Well Format 
L. Mere 

Aurora Biosciences Corporation 

7:15 p.m. - 7:45 p.m. Peptoids: A High Diversity Family of Synthetic, Protease-Stable 
Peptide Mimics 
A.E. Barron 

Chemical Engineering Department, Northwestern University, USA 
7:45 p.m. - 8:00 p.m. Discussion 

8:00 p.m. - 9:30 p.m. Dinner (Ballroom II & III) 

9:30 p.m. - 10:30 p.m. Social Hour (Ballroom Atrium) 

Wednesday, July 28, 1999 

7:00 a.m. - 8:15 a.m. Breakfast (Ballroom II & III) 

SESSION (f): HIGH THROUGHPUT BIOPROCESSING (Ballroom I) 
Session Chairs: S. Ozturk, Bayer Corporation 
S. Drew, Merck & Co. 

8:15 a.m. - 8:20 a.m. Introduction: Session Chairs 

8:20 a.m. - 8:50 a.m. An "Animal" on a Chip: Preclinical Evaluation of 
Pharmaceuticals 
M. L. Shuler 

School of Chemical Engineering, Cornell University, USA 

8:50 a.m. - 9:20 a.m. An Efficient Integrated Approach to Bioprocess Development 

R. Greasham 
Merck & Co. 

9:20 a.m. - 9:50 a.m. Biocatalytic Synthesis of Chiral Intermediate for 
Anti-Hypertension Drugs 
R. Patel 

Bristol-Myers Squibb Pharmaceutical Research Institute 

9:50 a.m. - 10:20 a.m. Coffee Break (Ballroom Atrium) 

10:20 a.m. - 10:50 a.m. Transient Gene Expression for Biotech Research 
A. Bernard 

Serono Pharmaceutical Research Institute, S.A., Switzerland 

10:50 a.m. - 11:20 a.m. High Throughput Bioprocessing: New Directions in 
Pharmaceutical Drug Development 
S. Ozturk 

Bayer Corporation 

11:20 a.m. - 11:50 a.m. Protein Production in Transgenic Plants 
V. Paradkar 
Monsanto Company 

11:50 a.m. - 12:00 noon Discussion 

12:00 noon - 1:30 p.m. Lunch (Ballroom II & III) 

1:30 p.m. - 4:00 p.m. Poster Session B / Ad hoc Sessions and/or free time (Ballroom Atrium) 

SESSION (g): CONTROLLING PRODUCT HETEROGENEITY (Ballroom I) 
Session Chairs: E.T. Papoutsakis, Northwestern University 
M.L. Shuler, Cornell University 

4:15 p.m. - 4:20 p.m. Introduction: Session Chairs 

4:20 p.m. - 4:50 p.m. Genetic and Physiological Manipulation of the Protein 
Glycosylation Pathway 
P. Jenkins 

Lilly Research Laboratories, Eli Lilly & Co. 

4:50 p.m. - 5:20 p.m. Controlling Secretory Processing to Improve Product Quality 
and Homogeneity 
M.J. Betenbaugh 

Department of Chemical Engineering, Johns Hopkins University, USA 

5:20 p.m. - 5:50 p.m. DNA Microarray for Metabolic Engineering 
J.C. Liao 

Department of Chemical Engineering, University of California Los Angeles, USA 

5:50 p.m. - 6:15 p.m. Coffee Break (Ballroom Atrium) 

6:15 p.m. - 6:45 p.m. The Domain-Bypass Mechanism for Creation of Structural 
Diversity in the Pikromycin Biosynthetic Gene Cluster 
D. Sherman 

University of Minnesota 

6:45 p.m. - 7:15 p.m. Sugars to Plastics, then to Fine Chemical Monomers: Production 
of Various Enantiomerically Pure (R)-Hydroxycarboxylic Acids 
S.Y. Lee 

Chemical Engineering Department, Korea Advanced Institute of Science and Technology, Korea 

7:30 p.m. - 10:00 p.m. Conference Banquet/AMGEN Award and Social Hour (Ballroom II & III) 
AMGEN Award Lecture 

Cell Engineering: Understanding and Controlling Receptor Processes 
Douglas A. Lauffenburger 
Massachusetts Institute of Technology 

Thursday, July 29, 1999 

7:00 a.m. - 8:15 a.m. Breakfast (Ballroom II & III) 

SESSION (h): DISCOVERY OF BIOLOGICAL MATERIALS (Ballroom I) 
Session Chairs: D. Kaplan, Tufts 
S.Y. Lee, KAIST, Korea 

8:15 a.m. - 8:20 a.m. Introduction: Session Chairs 

8:20 a.m. - 8:50 a.m. Bioengineered Microbial Lipopolysaccharides and Glycolipids 
D. Kaplan 
Tufts 

8:50 a.m. - 9:20 a.m. Bioactive Biomaterials for Engineering Tissue Healing 
J.A. Hubbell 
ETH Zurich 

9:20 a.m. - 9:50 a.m. Molecular Weight Control in Bacterial Hyaluronic Acid 

Production 

L. Nielsen 

Chemical Engineering Department, University of Queensland, Australia 

9:50 a.m. - 10:20 a.m. Coffee Break (Ballroom Atrium) 

10:20 a.m. - 10:50 a.m. S-Layers: Fundamentals and Applications in Molecular 
Nanotechnology and Biomimetics 
Prof. U.B. Sleytr 

University of Agricultural Sciences, Vienna, Austria 

10:50 a.m. - 11:20 a.m. Bioinductive Polymer Coatings for Implantable Glucose 
Biosensors 
J. Koberstein 
University of Connecticut 

11:20 a.m. - 11:50 a.m. Properties and Biotechnological Applications of the Azotobacter 
Vinelandii Modular Type Mannuronan C-5 Epimerases 
S. Valla 

Norwegian University of Science and Technology 

11:50 a.m. - 12:00 noon Discussion 

12:00 noon - 1:30 p.m. Lunch (Ballroom II & III) 

1:30 p.m. - 3:30 p.m. Ad hoc Sessions and/or free time 

SESSION (i): PRODUCTION OF COMPLEX PRODUCTS (Ballroom I) 
Session Chair: J.R. Swartz, Stanford University 

3:45 p.m. - 3:50 p.m. Cell Free Protein Synthesis for Discovery and Production 
J.R. Swartz 
Stanford University 

4:20 p.m. - 4:50 p.m. Production of Viral Vaccines: A Role For Biochemical 
Engineering 

D. Robinson 
Merck & Co., USA 

4:50 p.m. - 5:20 p.m. O2 and its Transport in Hematopoietic Life and Death 
E. T. Papoutsakis 
Northwestern University, USA 

5:20 p.m. - 5:45 p.m. Coffee Break (Ballroom Atrium) 

5:45 p.m. - 6:15 p.m. Stem Cell-Based Cellular and Tissue Engineering 
P. Zandstra 

University of Toronto, Canada 

6:15 p.m. - 6:45 p.m. The Production of Plasmid DNA for Gene Therapy: The Cell 
Lysis Step 
A.W. Nienow 

Centre for Bioprocess Engineering, School of Chemical Engineering, The University of Birmingham, 
UK 

6:45 p.m. - 7:15 p.m. Manipulation of the Glycosylation of Recombinant Proteins 
Produced in CHO Cells by Overexpression of Glycosyltransferase 
L. Krummen 
Genentech, Inc., USA 

7:15 p.m. - 7:30 p.m. Discussion 

7:30 p.m. - 9:00 p.m. Dinner (Ballroom II & III) 

9:00 p.m. - 10:00 p.m. Social Hour (Ballroom Atrium) 

Friday, July 30, 1999 

7:00 a.m. - 8:30 a.m. Breakfast (Ballroom II & III) 

8:30 a.m. - 10:00 a.m. Wrap-up Discussion (Ballroom I) 

10:00 a.m. - 12:00 noon Free Time 

12:00 noon - 1:30 p.m. Lunch (Ballroom II & III) 

1:30 p.m. Conference Adjournment 

Poster Session A 

A - 1 Cytoplasmic Expression of Disulfide Bonded Proteins in E. Coli 
Paul H. Bessette and George Georgiou 

Department of Chemical Engineering, University of Texas at Austin, USA 
Xiaoming Zhang 

Department of Molecular Biology, University of Texas at Austin, USA 

A - 2 Secretory Production of Recombinant Proteins Using an Artificial Signal Sequence 
Sang Yup Lee and Jong Hyun Choi 

Chemical Engineering Department, Korea Advanced Institute of Science and Technology, Korea 

A - 3 High Level Production of Human Leptin and Its Purification 
Sang Yup Lee and Ki Jun Jeong 

Department of Chemical Engineering, Korea Advanced Institute of Science and Technology, Korea 

A - 4 Amino Acid Depletion Model and Experimental Validation for Recombinant E. coli 
Kevin J.-R. Clark, Fu'ad Haddadin and Sarah W. Harcum 

Department of Chemical Engineering, New Mexico State University, Las Cruces, NM 

A - 5 Characterisation of Plasmid DNA Production in Escherichia Coli Fermentation for Gene 
Therapy 

R.D. O'Kennedy and E. Keshavarz-Moore 

Advanced Centre for Biochemical Engineering, Department of Biochemical Engineering, University 
College London, UK 

A - 6 Extremophilic Enzymes: Heterologous Gene Expression and Use as Biocatalysts 

Dr. Julian B. Chaudhuri, Dr. Helen Connaris, Gerald Sellek, Prof. Michael Danson and Dr. David 
Hough 

Department of Chemical Engineering, University of Bath, UK 

A - 7 Production of Antifungal Molecules Using Hydrothermal Marine Bacteria 
Joseph Boudrant 

Laboratory of Chemical Engineering Sciences-CNRS, France 
Joel Coulon and Roger Bonaly 

UMR UHP-CNRS, Biochimie Microbienne, Faculte de Pharmacie, France 
Georges Barbier 

Laboratory of Microbiology, IFREMER, Centre de Brest, France 
Jacques Dietrich 

Laboratory of Biotechnology, IFREMER, Centre de Brest, France 

A - 8 Extracellular Proteinases from Psychrotrophs: Keratinase 
Quamrul Hasan and Daniel G. Moran 

Procter & Gamble Far East, Inc., Research & Development Division, Japan 
Yasutaka Morita and Eiichi Tamiya 

Japan Advanced Institute of Science and Technology, The School of Materials Science, Japan 

A - 9 Study of Dynamic Response to Growth Rate Oscillation in Recombinant Bacillus Subtilis 
Culture 

S. Chuen-Im 

Biotechnology and Biochemical Engineering Group, Department of Food Science and Technology, 
University of Reading, UK 
D.L. Pyle and H.C. Lynch 

Department of Food Science & Technology, University of Reading, UK 

A - 10 Modelling of Baeyer-Villiger Monooxygenase Biocatalytic Reactions 

Matthew Hogan and John Woodley 

Department of Biochemical Engineering, University College London, UK 

A - 11 Axial Flow Up-Pumping Impellers in Industrial Fermentation Processes 

Abbott Laboratories, USA 

A - 12 Is the Production of Biomaterials by Fermentation Economically Viable? 
Anton P.J. Middelberg 

Department of Chemical Engineering, University of Cambridge, UK 
Yao Ling and Richard J. van Wegen 
University of Adelaide, Australia 

A - 13 Mathematical Model of Inorganic Phosphate Controlled Expression of Dengue Envelope 
Protein by Saccharomyces Cerevisiae 

Anan Tongta 

School of Bioresources and Technology, King Mongkut's University of Technology Thonburi, Thailand 
Chulee Yompakdee 

Pilot Plant Development and Training Institute, King Mongkut's University of Technology Thonburi, 
Thailand 

A - 14 Production and Recovery of Human Endostatin from Pichia Pastoris 

Joseph Shiloach, Loc Trinh and Santosh Noronha 

Biotechnology Unit, LCDB, NIDDK, National Institute of Health, USA 

A - 15 NADH Oscillations in Brewers Yeast 

Claus Emborg 

Center for Process Biotechnology, Department of Biotechnology, Technical University of Denmark, 
Denmark 

Jes Tobiassen 

Alfred Jørgensen Lab, Copenhagen 

A - 16 Exploiting the Diversity in Cell Wall Hydrolases of Trichoderma Reesei 

H.J. Meerman 

Genencor International, Inc. 

A - 17 Cultivation and Direct Regeneration System of Embryogenic Rice Callus Using Macroporous 
Support 

Hiroyuki Honda, Kwan Hoon Moon, Chunzhao Liu and Takeshi Kobayashi 

Department of Biotechnology, Graduate School of Engineering, Nagoya University, Japan 

A - 18 Glycosylational Engineering in Insect Cells: Effects of Gal T Overexpression 

Michael J. Betenbaugh, Eric Ailor and Y.C. Lee 

Department of Chemical Engineering, Johns Hopkins University, USA 

Noriko Takahashi 

Nakano Vinegar Co. Ltd. 

Donald Jarvis 

University of Wyoming, USA 

A - 19 Manipulation of the Apoptosis Pathway in Mammalian Cell Culture 

Tina M. Sauerwald, Bruno Figueroa, Jr., J. Marie Hardwick and Michael J. Betenbaugh 
Department of Chemical Engineering, Johns Hopkins University, USA 

A - 20 Impact of Dolichol Monophosphate Supplementation on Recombinant Gamma-Interferon 
Glycosylation 

Inn H. Yuk and Prof. Daniel I.C. Wang 
Massachusetts Institute of Technology, USA 

A - 21 Strategies for Production of High Titer Herpes-Based Gene Therapy Vectors 
J.C. Wetchuck, A. Ozuer, B. Russell and M.M. Ataai 
Chemical Engineering, University of Pittsburgh, USA 
J. Glorioso 

Molecular Biology and Biochemistry, University of Pittsburgh, USA 

A - 22 High Density Culture of Panax Notoginseng Cells in Bioreactors for Saponin Production 
H. Yao and J. J. Zhong 

State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, 
China 

A - 23 Phase Formation Behavior and Kinetics of the Separation of Phases and Proteins in Aqueous 
Two-Phase Systems 

B. A. Andrews, E. Huenupi, R. Munoz, and J. A. Asenjo 

Center for Biochemical Engineering and Biotechnology, Department of Chemical Engineering, 
University of Chile, Santiago, Chile 

A - 24 Effects of Bone Marrow Architectures on Oxygen Tensions and Gradients Experienced by 
Hematopoietic Cells in vivo 
Dominic Chow 
Northwestern University 

A - 25 Molecular Analysis of spo0A in Clostridium Acetobutylicum - Progress Towards Decoupling 
Stationary Phase Phenomena in Solventogenic Bacteria 

Latonia M. Harris 
Northwestern University 

A - 26 Surface Display of Enzymes as a Selective Screening System 
Jae-Gu Pan 

Korea Research Institute of Bioscience and Biotechnology 

Poster Session B 

B - 1 Engineering Microorganisms for Heavy Metal Removal 

Douglas S. Clark, J.D. Keasling, Clifford L. Wang, Sang-Weon Bang and Andrew C. Magyarosy 
Department of Chemical Engineering, University of California-Berkeley, USA 

B - 2 Application of Flow Cytometry to Study Substrate and Product Toxicity in the Indene 
Bioconversion 

Ashraf Amanullah 

Merck and Co. / University College London, Department of Biochemical Engineering, UK 

C. J. Hewitt, A.W. Nienow 

School of Chemical Engineering, The University of Birmingham, UK 
M. Chartrain, B.C. Buckland and S.W. Drew 

Department of Bioprocess Research and Development, Merck Research Laboratories, Merck and Co., 
Inc., USA 
J.M. Woodley 

Department of Biochemical Engineering, University College London, UK 

B - 3 Sedimentation of Manganese/Iron Ore Fines from Wash Water of an Ore Washing Plant 

A. D. Agate 

Agharkar Research Institute 

B - 4 DNA Cleavage and Biological Activities of Substituted Naphthyle Imides 
Dong-Zhi Wei, Dong-Hui Zhu and Jiang-Chao Qian 

Research Institute of Biochemistry, State Key Laboratory of Bioreactor Engineering, East China 
University of Science & Technology, China 
Tian-Bao Huang and Xu-Hong Qian 

Institute of Pharmaceuticals & Pesticides, East China University of Science & Technology 

B - 5 Bioconversion in a Membrane Bioreactor-Experimental and Computer Simulated Comparisons 
Preeta Tyagi 
Youngsoft, Inc. 
Prof. S.N. Upadhyay 
Institute of Technology, India 

B - 6 A Rapid and Rational Method for Selection of Cofactor Requiring Processes 
Katie C. Thomas and John M. Woodley 

The Advanced Centre for Biochemical Engineering, Department of Biochemical Engineering, 
University College London, UK 

B - 7 Directed Discovery and Analysis of Antisense Oligonucleotides 

Martin L. Yarmush, Charles M. Roth, S. Patrick Walton, and Arul Jayaraman 
Center for Engineering in Medicine, Massachusetts General Hospital, USA 

B - 8 Design and Discovery of Poly-(1-3)-Trans-(2-2)-GPDD from Acholeplasma Laidlawii 
L.L. Matz 

Matz & Associates, USA 

B - 9 Thermostable Peptide Ligase in DMF 
Liuqin Zhu and Wu Yukie 

Institute of Biophysics, Chinese Academy of Sciences, China 

Yang Yonghua and Yang Shengli 

Shanghai Research Center of Biotechnology, CAS, China 

B - 10 At Play in the Fields of Molecular Biology: Antibody Affinity Maturation via Combinatorial 
Genetics 

Jennifer Anne Maynard, Brent L. Iverson and George Georgiou 
Department of Chemical Engineering, University of Texas at Austin 

B - 11 Protein Engineering of a Lytic β-1,3-Glucanase Enzyme able to Permeabilize the Yeast Cell 
Wall 

B. A. Andrews, A. Olivera, M. Casas, O. Salazar, J. Molitor and J. A. Asenjo 

Center for Chemical Engineering, Department of Chemical Engineering, University of Chile, Chile 

B - 12 Aqueous Two-Phase Processes for the Recovery of Aroma Compounds Produced by Mycelial 
Cultures 

Marco Rito-Palomares and Alejandro Negrete 
Centro de Biotecnologia-ITESM, Mexico 
Leobardo Serrano and Enrique Galindo 
Instituto de Biotecnologia-UNAM, Mexico 

B - 13 Peptide Affinity Chromatography of Human Clotting Factor VIII: Column Experiments with 
Peptides Derived From the Factor VIII-Binding von Willebrand Factor Region 

Karin Amatschek, Rainer Hahn, Roman Necina and Alois Jungbauer 

386. URL nslij-genetics.org/search_omim.html, Online Mendelian Inheritance in 
Man database, Center for Medical Genetics, Johns Hopkins University 
(Baltimore, MD) and National Center for Biotechnology Information, National 
Library of Medicine (Bethesda, MD).*** 

URL qiagen.com, Qiagen RNeasy Mini Kit.*** 

URL rana.lbl.gov/EisenSoftware.htm, "Cluster" software.*** 

389. URL genome-www.stanford.edu/~sherlock/cluster.html, "XCluster" software.*** 

390. URL systembiology.ucsd.edu.*** 

391. URL tigr.org, The Institute for Genomic Research, J. Craig Venter Institute.*** 

392. URL tula.cifn.unam.mx:8850/regulondb/regulon_intro.frameset.*** 

URL workbench.sdsc.edu/, Biology Workbench.*** 